Dear HTML 5 Working Group,
The member publishers of the Association of American Publishers have put together some commentary on Issue #30 (longdesc) and how it impacts their current and future planned use of HTML. Please find this attached as a Word document.
We appreciate your taking this commentary into consideration,
Suzanne Taylor
Chair, Higher Education Accessible Technology Working Group
Association of American Publishers
Ed McCoyd, Esq.
Executive Director for Digital, Environmental & Accessibility Affairs
Association of American Publishers
71 Fifth Avenue, 2nd Floor
New York, NY 10003
Telephone: (212) 255-1851
emccoyd@publishers.org

Here is a text version of the attachment in Comment 1:
The longdesc HTML Attribute & Educational Publishing
As educational publishing companies increasingly offer content in
digital formats, accessibility mechanisms that support complex
instructional materials become increasingly important. The Association
of American Publishers submits this comment regarding the HTML 5 May 25,
2011 Working Draft <http://www.w3.org/TR/2011/WD-html5-20110525/>, Issue
30 <http://dev.w3.org/html5/status/issue-status.html#ISSUE-030>
(longdesc) in the hope that the longdesc attribute, or some mechanism
that provides the same benefits, will be available in HTML Version 5 and
beyond.
About the Association of American Publishers
The Association of American Publishers is the national trade association
of the U.S. book publishing industry. AAP's more than 300 members
include most of the major commercial publishers in the United States, as
well as smaller and non-profit publishers, university presses and
scholarly societies, small and large. AAP members publish hardcover and
paperback books in every field, educational materials for the
elementary, secondary, postsecondary, and professional markets,
scholarly journals, computer software, and electronic products and services.
Text Alternatives in Educational Publishing: Technology Requirements
Most images on the Web can be made accessible with about 25 or fewer
words <#ftn1> of plain text. In HTML, the alt attribute makes this
possible.
Instructional materials often use complex images (including photos,
graphics, diagrams, and maps) to illustrate concepts. For example, an
image may provide a real-world instance of a concept, which may make an
abstract discussion clearer to students. In such cases, students and
instructors with visual impairments need access to the same information.
Twenty-five or fewer words of plain text are not always sufficient to
convey the meaning of such visual content. Conveying information
effectively may not only involve more text but often requires some
combination of headings, nested lists, and data tables. In this
document, we refer to this as structured text. The benefits of
structured text for image descriptions were shown in a study conducted
by the National Center for Accessible Media, WGBH
<http://ncam.wgbh.org/experience_learn/educational_media/stemdx>. Refer
to this study for examples. Structured text allows a screen reader user
to choose navigation paths through data, much as a user who can see the
image might follow various paths visually while analyzing an image.
Structured text also gives us the opportunity to use discipline-specific
markup, such as MathML when, for example, parts of an image are labeled
with math notation. Headings allow labeling and dividing text into
shorter segments, to help students understand the schema for a specific
area of knowledge. Semantic tagging also allows users with screen
readers to move more quickly through material if they choose. And, once
the W3C's WAI-ARIA specification <http://www.w3.org/WAI/intro/aria> is
further implemented, aria-flowto will provide another structure for
representing flow charts and decision trees in structured text.
Use of structured text as a text alternative for an image is supported
in HTML through the longdesc attribute. Though there are other options
for presenting structured text, the longdesc attribute provides the
following benefits:
For User Experience:
  * The longdesc attribute is a dedicated mechanism for just this
    purpose, and it always works in the same way:
      o Students and instructors will find the same user interface
        throughout all materials, so they will not need to learn new
        interfaces product-to-product, which takes time and attention
        away from the learning content.
      o The longdesc attribute can be revealed programmatically through
        browser extensions, providing access for users who do not use
        screen readers. Many users benefit from text alternatives,
        especially users with low vision.
      o The longdesc attribute does not impact the visual design. So,
        authors do not have to worry about how the text might impact the
        visual user experience. Authors can, therefore, focus on the
        experience of students and instructors with visual impairment
        while they write text alternatives. This focus on the primary
        audience helps authors create text that is well-suited for its
        purpose.
For Production Processes and Quality Assurance:
  * The longdesc attribute is easy to code. There is no need for custom
    scripting.
  * The longdesc attribute works with assistive technology today. If the
    longdesc attribute continues to be supported, content that works
    well for users today can be used in future products without editing.
  * The longdesc attribute can be programmatically recognized and
    tracked, allowing publishers to locate existing long descriptions
    and to test for the presence of long descriptions.
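
As one illustration of the last point, a presence check can be sketched in a few lines of Python using only the standard library's html.parser module. This is a minimal sketch under stated assumptions, not any publisher's actual tooling: the class name LongdescAudit, the sample markup, and the file names in it are hypothetical.

```python
from html.parser import HTMLParser


class LongdescAudit(HTMLParser):
    """Collects <img> elements with and without a longdesc attribute."""

    def __init__(self):
        super().__init__()
        self.with_longdesc = []     # (src, longdesc URL) pairs
        self.without_longdesc = []  # src values of images lacking longdesc

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src") or ""
        longdesc = attributes.get("longdesc")
        if longdesc:
            self.with_longdesc.append((src, longdesc))
        else:
            self.without_longdesc.append(src)


# Hypothetical sample markup, for illustration only.
SAMPLE = """
<p>Figure 1 shows the water cycle.</p>
<img src="water-cycle.png" alt="Diagram of the water cycle"
     longdesc="water-cycle-desc.html">
<img src="logo.png" alt="Publisher logo">
"""

audit = LongdescAudit()
audit.feed(SAMPLE)
print("Has longdesc:", audit.with_longdesc)
print("No longdesc:", audit.without_longdesc)
```

Because the check keys on a single dedicated attribute, the same scan could be run unchanged across products, whatever their visual design or scripting.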
We are using longdesc increasingly in our products. Unless a different
mechanism is created that meets all these requirements, we urge the W3C
to keep the longdesc attribute in HTML specifications moving forward.
We do acknowledge that user agent support for the longdesc attribute
should be improved. In particular, users who have low vision or who find
image descriptions helpful for any reason should be able to set their
user agent to reveal the descriptions. The HTML 5 specification should
clarify that user agents should provide this functionality in addition
to passing information to assistive technologies. In this case,
publisher documentation for products with numerous longdesc attributes
might include tips about use of these user agent settings.
Evaluating Other Solutions
We discuss the aria-describedby attribute below to illustrate that
solutions that at first seem to duplicate the qualities of the longdesc
attribute may not actually be as useful when implemented.
The aria-describedby attribute takes the unique identifier (ID) of
another object on the same page as its value. In other words, it points
to another object (e.g. a paragraph or a link) on the page. This
attribute could become an effective way for developers to indicate that
the information provided by an image is actually redundant with other
information on the page.
Screen reader developers might implement this attribute so that, by
default, it is silent in screen readers when used on an image. They might
also allow those who want additional information to set their screen
reader to announce aria-describedby and to provide a way to jump to the
object indicated by the attribute. An instructor, for example, might
choose this setting to be aware of what sighted students will be
experiencing.
But, the aria-describedby attribute falls short as a mechanism to link
to a separate page of structured text. The aria-describedby attribute
could point to a link on the same page as the image, but:
  * Hiding the link visually would require custom CSS or scripting. The
    mechanism for hiding the link would therefore differ
    product-to-product, making browser extensions or features to show
    the links more complex to code and less reliable for users.
  * The link would have to be present on the page for screen reader
    users, creating redundancy for those users.
  * Since the aria-describedby attribute points to a link or to other
    content on the same page, its structure implies a two-step process
    to reach the text alternative. Compared with longdesc, the two-step
    process is more tedious:
      1. The user moves to the object that aria-describedby references.
      2. If the object is or contains a link or a button, the user
         interacts with that object to move to the text alternative.
If the issues above are resolved and aria-describedby is used as a way
to access descriptions that are otherwise hidden from all users
(including screen reader users), another problem emerges. In that case,
aria-describedby cannot be silent by default in screen readers when used
on images, compromising its use to illustrate that the content of an
image is already available on the page. Developers may not realize the
distracting and frustratingly circular user experience that this would
cause and might use aria-describedby to point to, for example, a
paragraph just above the image. Users would then likely follow the
aria-describedby announcement, expecting to find additional content, but
they would arrive, instead, at a paragraph that they have likely just read.
We urge the W3C HTML Working Group to write out the expected
implementation and user experience details of any proposed replacements
for the longdesc attribute to be sure that they will be at least as
effective as the longdesc attribute in practice.
<#ftn1> Educational publishers often localize materials, and
different languages have different average word lengths. We also want to
encourage use of vocabulary from the main text in our text alternatives.
So, we find word count more useful as a measure of text alternative
length than character count.

(In reply to comment #5)
> Thanks, plh.
>
> (In reply to comment #4)
> > * The longdesc attribute is a dedicated mechanism for just this
> > purpose, and it always works in the same way:
>
> Indeed. It is ignored due to widespread misuse.
If you want to talk about misuse, then let's also remove the table element.
Rather than obsolete longdesc, strengthen the information about longdesc in the specification and ensure that browsers support it fully so that page authors/developers can verify its correct use.
It is as simple as that.

These requirements appear to be better met by using a normal link, with a programmatically determinable association where required (e.g. a rel attribute: <a href="ld.html" rel="longdesc"><img></a>). As I understand it, James Craig from PFWG has previously suggested using normal links to Suzanne. Is there some other usage scenario for publishers that specifically requires the longdesc attribute? Can you describe it?
https://twitter.com/#!/cookiecrook/status/22096374710
> The longdesc attribute works with assistive technology today.
With *some* AT today. longdesc is unsupported in popular SRs e.g. VoiceOver, Orca and NVDA, as well as some other AT e.g. screen magnifiers, and is also unavailable to IE, Firefox, Chrome and Safari users by default. Support also seems to be poor in the DTB sector:
http://diagramcenter.org/index.php?option=com_content&view=article&id=24&Itemid=28
Using a normal link has the benefit that it works for everyone now, does not require software upgrades or user retraining, and W3C-WAI are not planning to obsolete it. See also RNIB and WebAIM advice:
http://webaim.org/techniques/images/longdesc
[Word .doc] http://www.rnib.org.uk/professionals/Documents/Background%20and%20guidance%20-%20RNIB%20Surf%20Right%20web%20accessibility%20v1%200.doc
> We are using longdesc increasingly in our products.
Can you give us some examples?
Are there any scenarios where using a normal link on the image, with rel="longdesc" if required, does not provide an immediate, significant improvement in accessibility and usability, while still meeting all the requirements you listed?

(In reply to comment #8)
> Using a normal link has the benefit that it works for everyone now, does not
> require software upgrades or user retraining, and W3C-WAI are not planning to
> obsolete it. See also RNIB and WebAIM advice:
...
> Are there any scenarios where using a normal link on the image, with
> rel="longdesc" if required, does not provide an immediate, significant
> improvement in accessibility and usability, while still meeting all the
> requirements you listed?
The AAP posting seems to reject the use of anchors:
* Hiding the link visually would require custom CSS or scripting. The
mechanism for hiding the link would therefore differ
product-to-product, making browser extensions or features to show
the links more complex to code and less reliable for users.
* The link would have to be present on the page for screen reader
users, creating redundancy for those users.
The AAP seems to be looking for a long-term solution, other than using anchor tags:
"We urge the W3C HTML Working Group to write out the expected implementation and user experience details of any proposed replacements for the longdesc attribute to be sure that they will be at least as effective as the longdesc attribute in practice."
ARIA markup looks like a viable solution:
"Semantic tagging ... once the W3C's WAI-ARIA specification is further implemented". (AAP)
From the ARIA spec:
http://www.w3.org/TR/wai-aria/states_and_properties#aria-flowto
"In the case of one or more IDREFS, user agents or assistive technologies SHOULD give the user the option of navigating to any of the targeted elements"
Using CSS display: none and aria-flowto may address the issues posted without causing visual artifacts, allowing the viewer to choose between the extended description and the typical document flow, without additional anchors.
With the canvas tag, we use non-visible content to represent a description of the visual media. The img tag does not have a similar option for child nodes (though the svg image tag does). It's preferable to include alternate content within the same document. SVG image and HTML canvas avoid longdesc by using the element subtree.
(long term) Example:
<img aria-flowto="nextParagraph imgDescription" aria-owns="imgDescription" alt="Pie chart illustrating the reasons people like pie." />
<div aria-label="Analyzing the pie chart" style="display: none" id="imgDescription">This pie chart is broken up into 9 sections...</div>
<p id="nextParagraph" aria-label="Discussing the pie chart">
Pie is excellent, everyone can see that...</p>
Note: The long description here is not visible to users, as it has the CSS display: none style applied. This removes it from the rendering tree and also diminishes the ability of elements inside the tree to receive events. Image content and long description content are typically static.

> (long term) Example:
>
> <img aria-flowto="nextParagraph imgDescription" aria-owns="imgDescription"
> alt="Pie chart illustrating the reasons people like pie." />
> <div aria-label="Analyzing the pie chart" style="display: none"
> id="imgDescription">This pie chart is broken up into 9 sections...</div>
> <p id="nextParagraph" aria-label="Discussing the pie chart">
> Pie is excellent, everyone can see that...</p>
>
> Note: The long description here is not visible to users, as it has the css
> display none style applied. This removes it from the rendering tree as well as
> diminishing the ability of elements inside of the tree from receiving events.
> Image content and long description content is typically static.
Thanks for sharing this solution. I personally think that this solution is very clever. (CSS display:none content is typically hidden from screen reader users, so you'd have to position it off screen instead).
I also personally think improving longdesc is the easiest way to move forward.
But, I want to clarify that the AAP commentary simply says that something at least as good as longdesc is needed and asks that the full user experience be considered as solutions are considered. That commentary is not for or against any particular solution. It also suggested that longdesc's implementation details should be improved.
In finding a solution, the specification-writing community needs to not only make sure the data is in the page, but also make sure the user experience is good (from coding, to screen reader use, to browsing without screen readers but needing the long descriptions). Answering questions like those below is the key to this:
With such coding, how could you write a browser feature that would show all long descriptions? The solution above uses nothing unique to long descriptions, and so long descriptions would be difficult/impossible to detect.
Also, what is the overall screen reader user experience? For example, in the solution above, how do you get back from the description, if you took that path? I'm sure a way could be specified. There might be a keyboard command to move to the previous step in aria-flowto. Also, if you just want to stay with the main text, and skip any image descriptions, how do you know which to choose: "Analyzing the pie chart" or "Discussing the pie chart"? This is the type of analysis the AAP paper was requesting for any proposed replacements.
Finally, how does the skill set required to code this compare with the skill set required to code longdesc? It seems that (with a little further discussion/refinement on this solution) this could be used for hidden long descriptions while aria-describedby is used when the description is on the same page, allowing a different user experience for the 2 different situations - that's great. longdesc and aria-describedby make a similar duo. But, which duo is more complex to teach/code?

> Are there any scenarios where using a normal link on the image, with
> rel="longdesc" if required, does not provide an immediate, significant
> improvement in accessibility and usability, while still meeting all the
> requirements you listed?
(note: I can't speak for AAP offhand.)
Is there a proposal for W3C to define how rel="longdesc" should function in browsers/AT? I did not see this in the specification.
Other than preventing the image from being a link somewhere and having a long description at the same time, as far as I can personally tell, this provides the data that is needed.
But, will the data be used by browsers to show where there are long descriptions? Or, will users without screen readers have to click every image to find out if there's a long description? (same issue longdesc has now.)
Similarly, will the data be used by AT? Or will users have to hear the generic link announcement, such as "link image Starry Night" and wonder if the link is to a page about the painting or to a description of the painting?
I'm sure rel="longdesc" could be refined through a specification of the implementation details to work well for this. HTML 5 might even allow nested <a href ...> tags if one is rel="longdesc", allowing an image to both be a link and have a long description.
Personally, I'd favor improving longdesc--just seems logical to me. But, if the user experience details are specified for rel="longdesc" solving the issues mentioned above, that seems fine to me too.
I can't speak for AAP offhand. But, the commentary is more concerned that there be a solution with a good user experience/developer experience and much less concerned with what the exact code is.

(In reply to comment #12)
> > (long term) Example:
> >
> > <img aria-flowto="nextParagraph imgDescription" aria-owns="imgDescription"
> > alt="Pie chart illustrating the reasons people like pie." />
> > <div aria-label="Analyzing the pie chart" style="display: none"
> > id="imgDescription">This pie chart is broken up into 9 sections...</div>
> > <p id="nextParagraph" aria-label="Discussing the pie chart">
> > Pie is excellent, everyone can see that...</p>
> >
>
> Thanks for sharing this solution. I personally think that this solution is very
> clever. (CSS display:none content is typically hidden from screen reader users,
> so you'd have to position it off screen instead).
The content of longdesc is hidden, as well, from the initial view. I'd hope that through aria-flowto, the AT would lower the priority of the display: none semantic. Of course, if that's not the case, there are many, many visibility methods, such as offscreen positioning. For modern browsers, there's <canvas height="0" width="0">alt content</canvas> and style="transform: scale(0,0)", and all of that. Those kinds of issues are really about supporting specific user agents and secondary user agents (such as IE6 or JAWS9, etc).
> With such coding, how could you write a browser feature that would show all
> long descriptions? The solution above uses nothing unique to long descriptions,
> and so long descriptions would be difficult/impossible to detect.
>
Items such as aria-expanded and aria-haspopup, as well as aria-controls and aria-owns, give authors an opportunity to mark up their code with more information; they also provide a standard means for other developers to identify expanded and minimized content. I would recommend aria-expanded and similar semantics for showing and hiding descriptions. There are various roles and HTML5 tags (such as figcaption) which can also add semantics to the elements, and in the worst case, data-* can be used on a per-site basis.
> Also, what is the over-all screen reader user experience? For example, in the
> solution above, how do you get back from the description, if you took that
> path? I'm sure a way could be specified. There might be a keyboard command to
> move to the previous step in aria-flowto. Also, if you just want to stay with
> the main text, and skip any image descriptions, how do you know which to choose
> "Analyzing the pie chart" or "Discussing the pie chart"? This is the type of
> analysis the AAP paper was requesting for any proposed replacements.
>
These are going to be AT specific -- as things typically are. In the example I shared, the main text and the image description are both options in flowto -- but only the image description is set in the aria-owns semantic -- this lets the AT know that the description is a long description for the image. As there is only one other path, the AT can continue with it.
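
The rule described here can be sketched in code, to show what a browser feature or AT would have to do to apply it. This is a minimal sketch in Python using the standard library's html.parser module, not a description of how any existing AT behaves. It assumes the rule as stated in this comment (an ID listed in both aria-flowto and aria-owns is the long description, and any other aria-flowto ID is the main path); the class name FlowtoDescriptionFinder and its result keys are hypothetical.

```python
from html.parser import HTMLParser


class FlowtoDescriptionFinder(HTMLParser):
    """For each <img>, splits aria-flowto targets into description vs. main path."""

    def __init__(self):
        super().__init__()
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        flowto = (attributes.get("aria-flowto") or "").split()
        owns = set((attributes.get("aria-owns") or "").split())
        self.results.append({
            "alt": attributes.get("alt") or "",
            # Assumed rule: a flowto target that is also owned is the description.
            "description_ids": [i for i in flowto if i in owns],
            "main_path_ids": [i for i in flowto if i not in owns],
        })


# The example markup from earlier in this thread.
MARKUP = """
<img aria-flowto="nextParagraph imgDescription" aria-owns="imgDescription"
 alt="Pie chart illustrating the reasons people like pie." />
<div aria-label="Analyzing the pie chart" style="display: none"
 id="imgDescription">This pie chart is broken up into 9 sections...</div>
<p id="nextParagraph" aria-label="Discussing the pie chart">
Pie is excellent, everyone can see that...</p>
"""

finder = FlowtoDescriptionFinder()
finder.feed(MARKUP)
for result in finder.results:
    print(result)
```

Under this rule, an image that uses aria-flowto without aria-owns yields no description IDs, so detection depends on authors applying both attributes consistently.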
One of the nice things about ARIA is that the scripting environment can also base its behavior on ARIA semantics. One could use this vocabulary (role="img", aria-flowto + aria-owns, and aria-expanded) to drive the scripting environment, instead of using private variables. This allows the scripting engine to rely on the DOM state instead of an opaque, internal state. And that's handy.
> Finally, how does the skill set required to code this compare with the skill
> set required to code longdesc? It seems that (with a little further
> discussion/refinement on this solution) this could be used for hidden long
> descriptions while aria-describedby is used when the description is on the same
> page, allowing a different user experience for the 2 different situations -
> that's great. longdesc and aria-describedby make a similar duo. But, which duo
> is more complex to teach/code?
What is our baseline for the skill-set? The vocabulary used is larger than "longdesc", with its minimal vocabulary, but that also means more room for semantic expression, and in a sense, a more rigid environment. With longdesc, a scripting author might have to use onclick and window.open to share a description; with ARIA, an author might use aria-expanded and display instead. setAttribute and getAttribute are a little bit easier in that minimal case. ARIA is more complex, but that complexity benefits all users, if the markup is added. The same applies to adding HTML5 markup and other such roles. It should be there anyway, even if longdesc is used, as it enhances AT.
For the widest possible support, I'd include longdesc as a legacy/backup, but expect that the vast majority of ATs support ARIA. Targeting legacy environments is a chore unto itself.

As far as status on this bug, I just want to note here that there's no action for an editor to take until after the group and the chairs complete re-consideration of issue-30, which is currently at the stage that the chairs are reviewing change proposals.
http://dev.w3.org/html5/status/issue-status.html#ISSUE-030

Suzanne, the HTML Image Description Extension (longdesc) specification is now a W3C Proposed Recommendation: http://www.w3.org/TR/2014/PR-html-longdesc-20141204/
W3C Member reviews are due by 16 January, as per the information in the section "Status of This Document"...
Closing the bug.