Audio description

an audio description is provided of all significant visual information in scenes, actions, and events that cannot be perceived from the sound track alone to the extent possible given the constraints posed by the existing audio track and limitations on freezing the audiovisual program to insert additional auditory description.


  1. The passage beginning “to the extent possible” is tautological and betrays an ignorance of audio description, which by definition is limited to the existing audio track.
  2. So-called extended descriptions, which involve freezing the main presentation to describe at length, are purely experimental at this point, but in any event there is no “limitation on freezing the audiovisual program.” In extended description, the describer can freeze the program whenever and wherever he or she wants.
  3. The term “auditory description” must be replaced by “audio description” everywhere in WAI documents.
  4. The grammar is rather poor.

Implementation issues

  1. It’s hard to defend blanket exemptions of programming categories or individual programs on television, but some categories or programs nonetheless may not require description online.
    1. Online video isn’t TV; it’s usually short and nonfiction in nature.
    2. WCAG should not make the mistake of attempting to itemize exemptible categories or programs, but its guidelines should acknowledge the need for occasional exemptions.
    3. A requirement to describe every programming type would exceed any known broadcasting requirement.
  2. The guideline does not concede that live audio description has almost never been attempted and is unreasonably difficult even for TV broadcasting.
  3. As written, the guideline requires that Webcams be continuously described.



  1. The guideline should at least read “all significant dialogue and sound effects are captioned.”

Implementation issues

  1. How can “a transcript or other non-audio equivalent” be “synchronized with the events they represent”?
  2. The use of the term “significant” will give producers license to caption only a reduced portion of the original audio.
  3. The phrase “real-time and audio-only and not time-sensitive and not interactive” is impossible to understand. It appears to require a transcript for any kind of Internet radio – all day, every day, forever.
  4. What is a possible “other non-audio equivalent”? If the guideline attempts to say that providing sign-language interpretation is sufficient, WAI needs to understand that it would thereby sanction foreign-language translation, which wouldn’t serve even the entire deaf/hard-of-hearing population and would open up a very large kettle of fish.

Real-time video

if the Web content is real-time video with audio, real-time captions are provided unless the content:


  1. The guideline does not distinguish among:
    1. live programs streamed live
    2. live programs stored and streamed later
    3. pre-recorded programs streamed as they are first broadcast
    4. pre-recorded programs stored and streamed later
  2. The guideline has no allowance for repeat Webcasts.
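
The cases listed above, plus repeat Webcasts, can be written down as a small classification. What follows is only an illustrative sketch in Python; the names `Origin`, `Delivery`, `Webcast`, `LABELS`, and `describe` are hypothetical and come from neither the guideline nor any WAI document.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    LIVE = "live"
    PRERECORDED = "pre-recorded"


class Delivery(Enum):
    SIMULTANEOUS = "streamed as it happens or is first broadcast"
    STORED = "stored and streamed later"


@dataclass(frozen=True)
class Webcast:
    origin: Origin
    delivery: Delivery
    is_repeat: bool = False  # repeat Webcasts are not mentioned in the guideline


# The four combinations the guideline does not distinguish.
LABELS = {
    (Origin.LIVE, Delivery.SIMULTANEOUS): "live program streamed live",
    (Origin.LIVE, Delivery.STORED): "live program stored and streamed later",
    (Origin.PRERECORDED, Delivery.SIMULTANEOUS): "pre-recorded program streamed as first broadcast",
    (Origin.PRERECORDED, Delivery.STORED): "pre-recorded program stored and streamed later",
}


def describe(webcast: Webcast) -> str:
    """Return a plain-language label for one Webcast case."""
    label = LABELS[(webcast.origin, webcast.delivery)]
    return (label + " (repeat)") if webcast.is_repeat else label


print(describe(Webcast(Origin.LIVE, Delivery.STORED)))
```

A guideline that separated the cases this way could, at least in principle, set different expectations for each rather than treating all of them as “real-time.”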

Implementation issues

  1. The guideline essentially requires every Web-based TV service to caption everything, 24 hours a day, forever. Yet there is no known television broadcaster with such a requirement – even the Canadian broadcasters with 100% captioning requirements (e.g., CBC, Newsworld, Movie Network, Viewer’s Choice) have small exemptions, as for outside commercials or promos.
  2. Assuming the guideline uses the term “real-time” to mean “live,” note that stenocaptioning can be done reliably only in English. French-, Spanish-, and Italian-language broadcasts are also captioned in real time, but with much greater difficulty. Some languages have no real-time captioning whatsoever, and for those languages there is no way to meet the guideline.
  3. There is no reliable way to produce online closed captions, using media players’ scripting or text formats, that works in the real world. Magpie and similar SMIL/SAMI generators are not production-ready.
  4. Authors or producers or Webcasters are not qualified captioners.
  5. Setting up a broadcast station costs an astronomical sum; streaming video costs far less. Captioning could therefore cost more than the Webcast itself.


If the Web content is real-time non-interactive video (e.g., a Webcam of ambient conditions), either provide an equivalent that conforms to checkpoint 1.1 (e.g., an ongoing update of weather conditions) or link to an equivalent that conforms to checkpoint 1.1 (e.g., a link to a weather Web site).

Implementation issues

  1. What about Webcams the viewer can operate remotely? Aren’t those “interactive”? Doesn’t the person manipulating the Webcam become the author of the content, thus responsible for making it accessible?
  2. How can unstaffed Webcams ever be provided with an “equivalent”? That would imply a human being watching the output and selecting an “equivalent” of what the Webcam currently displays.
  3. The examples given are not reasonable. The appearance of the outside world is not the same as its weather. (Clear skies could be cold or warm.)
  4. There is no method given for the equivalents. The alt text of a Webcam still image cannot link to “an ongoing update of weather conditions,” though the author could link the entire image to such a site.
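
The workaround mentioned in the last point, linking the whole image rather than its alt text, might look like the sketch below. It is written in Python only to generate the markup; the function name `linked_webcam_image` and every URL shown are hypothetical placeholders, not anything the guideline specifies.

```python
from html import escape


def linked_webcam_image(image_url: str, alt_text: str, equivalent_url: str) -> str:
    """Wrap a Webcam still image in a link to a separate text equivalent.

    The alt text remains plain text; the link is carried by the
    surrounding <a> element, not by the alt attribute.
    """
    return (
        f'<a href="{escape(equivalent_url, quote=True)}">'
        f'<img src="{escape(image_url, quote=True)}" '
        f'alt="{escape(alt_text, quote=True)}"></a>'
    )


# Hypothetical values, for illustration only.
print(linked_webcam_image(
    "/webcam/latest.jpg",
    "Current view from the rooftop Webcam",
    "https://weather.example/conditions",
))
```

This shows only that the markup is possible. It does nothing to answer the objection that the linked page is not an equivalent of what the Webcam displays.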

“Pure audio”

if a pure audio or pure video presentation requires a user to respond interactively at specific times in the presentation, then a time-synchronized equivalent (audio, visual or text) presentation is provided.

Implementation issues

  1. The circumstance to which the guideline applies is not clear. What is, for example, a “pure[-]audio presentation [that] requires a user to respond interactively at specific times in the presentation”? How is that different from a phone call?
  2. The guideline appears to require live captioning of an audio-only stream (which has no visual component to which to add captions) and live description of a video-only stream (which may or may not have an audio component that could be used for descriptions). It’s theoretically possible to do both, but so massively difficult and expensive as to be impracticable here in the real world.
  3. The mechanisms used for live captioning of an audio feed would undoubtedly be prohibited by WCAG 2.0’s likely anti-JavaScript, anti-multimedia, anti-everything-but-plain-text provisions. There’s also no precedent for captioning an audio-only feed in other audiovisual media (though there was, in years past, some discussion of transmitting captions by radio subcarrier).
  4. The blandishment “a time-synchronized equivalent (audio, visual or text) presentation” is at best ungrammatical and emphasizes the fact that WCAG cannot even say what it means. By a plain reading of the guideline, an author could provide an audio “equivalent... presentation” with an audio original.


if content is rebroadcast from another medium or resource that complies to broadcast requirements for accessibility (independent of these guidelines), the rebroadcast satisfies the checkpoint if it complies with the other guidelines.

Implementation issues

  1. Essentially, this guideline admits that existing broadcaster guidelines trump WCAG. Quite possibly something along these lines is the correct course of action.
  2. The guideline fails to state that the “content” must be “rebroadcast” with the accessibility features intact. But how? Closed or open? Single stream or multiple?


Display features of text equivalents

Text equivalents should be easily convertible to Braille or speech, displayed in a larger font or different colors, fed to language translators or abstracting software, etc.

Implementation issues

  1. Text equivalents are to be displayed only if the original item cannot be. According to the cascade principle, their display characteristics should match those of the items for which they are “equivalent.” (Only Gecko does this properly with alt texts, for example, styling them according to the parent element or other declaration.)
  2. When the guideline says that equivalents “should” be “easily convertible to Braille or speech... etc.,” it implies that authors must set things up so that such conversions are “easy.” That is, at best, a user-agent issue, if it is even possible or permitted by the specs.

Using colour for structure

  1. for visual presentations, use font variations, styles, size and white space to emphasize structure.
  2. use color and graphics to emphasize structure.
  3. if content is targeted for a specific user group and the presentation of the structured content is not salient enough to meet the needs of your audience, use additional graphics, colors, sounds, and other aspects of presentation to emphasize the structure.


  1. A “style” or a “size” is a “font variation.”
  2. “Salient” means “noticeable.” Is that what the guideline truly means?

Implementation issues

  1. The entire thrust of the Web Content Accessibility Guidelines 2.0 discourages anything resembling graphic design. Colour is particularly discouraged. How does WCAG 2.0’s disparagement of visual sophistication square with this advice to use visual methods to “emphasize” structure?
  2. What will WCAG 2.0 require in the way of fallback presentations? WCAG 1.0 was obsessed with making every Web page work in the dumbest imaginable devices – no stylesheets, no scripting, no nothing but 1995-era HTML rendering. How will the current advice comply with 2.0’s separate requirements for “backward compatibility”?
  3. Name three ways in which one could use “graphics” to “emphasize structure.”
  4. It’s tautological to advise authors to use further presentational features if the “presentation” is not “salient” enough. What author would not add further presentational features if the author didn’t like the default presentation? Which authors ever like the default presentation?

Foreground/background combinations

text content is not presented over a background image or color or the colors used for the text and background or background image pass the following test: (no tests/algorithms are available at this time)


  1. Testing is a non-starter. Unlike colour deficiency, colour preferences for easier reading are not cut and dried and can never be simplified into a formula, which is the sort of test WCAG 2.0 contemplates.
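
To make concrete what a formula-style test looks like, here is a minimal sketch. The guideline itself names no algorithm; the weights and the threshold of 125 below are taken, as an assumption for illustration, from the brightness-difference heuristic suggested in W3C’s working draft on techniques for accessibility evaluation and repair tools. The function names are hypothetical.

```python
def brightness(rgb):
    """Perceived brightness of an (R, G, B) colour, each channel 0-255."""
    r, g, b = rgb
    return (299 * r + 587 * g + 114 * b) / 1000


def brightness_difference(foreground, background):
    """Absolute brightness difference between two colours."""
    return abs(brightness(foreground) - brightness(background))


def passes_formula(foreground, background, threshold=125):
    """Pass/fail verdict; the threshold is an assumed, illustrative value."""
    return brightness_difference(foreground, background) >= threshold


black, white = (0, 0, 0), (255, 255, 255)
print(brightness_difference(black, white))  # 255.0
print(passes_formula(black, black))         # False
```

A test of this kind returns the same verdict for every reader, which is exactly why it cannot capture individual colour preferences for easier reading.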

Implementation issues

  1. The guideline ignores the fact that the three main forms of colourblindness are the central accessibility issue in colour selection. Nothing else is even in the same league.
  2. Text content is always presented over a background colour, save for the nonsense case where foreground and background colours are identical.
  3. If an author with no visual impairment can read the content, that author cannot reasonably predict which figure–ground combinations will or will not be accessible to someone with a visual impairment.
    1. The converse is, however, true: If a sighted author finds the colour combinations hard to read, a visually-impaired person probably will, too.
    2. Beyond that kind of plainly-evident illegibility, there is no way to predict what a visually-impaired visitor will require.
  4. The guideline essentially dictates æsthetics. Under the guise of making text legible, it tells authors never to attempt to make a page attractive, because that would usually require a foreground colour and a background colour or image.