This document contains my comments regarding WAI’s main publication, the Web Content Accessibility Guidelines 2.0 (WCAG 2).
Every attempt has been made to make WCAG 2.0 and the related documents listed above as readable and usable as possible while retaining the accuracy and clarity needed in a technical specification. Sometimes technical terms are needed for clarity or testability.
In fact, at 72 pages and 20,800 words, the WCAG 2 main document is half a book’s length and is studded with jargon. Informed people with no disability whatsoever will find it hard to understand. The Working Group has failed to deliver a standards document that can be understood unto itself without reference to two other documents (notably the Understanding document, at twice the length of the actual WCAG 2).
The primary natural language or languages of the Web unit can be programmatically determined.
lang="mul". Support for that code will be rather questionable in real-world devices, and its existence came as a surprise even to Richard Ishida of the W3C, who wrote a Xerox paper that mentioned it.
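As a rough illustration of what "programmatically determined" amounts to for language, the sketch below lists every element that declares a `lang` attribute. It uses only Python's standard `html.parser`; the names `LangCollector` and `declared_languages` are invented for this example and do not come from WCAG 2 or any user agent. Note that the code only reports the declared value: whether a device does anything useful with a code like `mul` is a separate question.

```python
from html.parser import HTMLParser


class LangCollector(HTMLParser):
    """Record (tag, lang) for each element that declares a lang attribute."""

    def __init__(self):
        super().__init__()
        self.langs = []  # kept in document order

    def handle_starttag(self, tag, attrs):
        lang = dict(attrs).get("lang")
        if lang:
            self.langs.append((tag, lang))


def declared_languages(html):
    """Hypothetical helper: return all declared languages in a fragment."""
    collector = LangCollector()
    collector.feed(html)
    return collector.langs


sample = '<html lang="mul"><body><p lang="fr">Bonjour</p><p>Hello</p></body></html>'
print(declared_languages(sample))
```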
The WCAG 2 main document contains a glossary that actually builds on other authorities and glossaries. The Working Group appears to have given up its arrogant and ignorant assumption that only it may write definitions for terminology.
Nonetheless, there are some anomalies:
- audio description
- narration added to the soundtrack to describe important visual details that cannot be understood from the main soundtrack alone. [...] Audio descriptions of video provide information about actions, characters, scene changes, and on-screen text. [...] In standard audio description, narration is added during existing pauses in dialogue.
- captions
- text presented and synchronized with multimedia to provide not only the speech, but also sound effects and sometimes speaker identification
The term they’re looking for here is “non-speech information, including meaningful sound effects and identification of speakers” (the latter a slightly different sense than “speaker identification,” which seems to require explicitly naming the speaker).
Note: In some countries, the term “subtitle” is used to refer to dialogue only and “captions” is used as the term for dialogue plus sounds and speaker identification. In other countries, subtitle (or its translation) is used to refer to both.
Those other countries are wrong. Captioning is captioning and subtitling is subtitling. WCAG 2 should not muddy the waters by giving any credibility to errors of nomenclature in other English dialects.
- label
- text, image, or sound that is presented to a user to identify a component within Web content
Apparently it’s possible to label something solely with a sound. Doesn’t a sound, like an image, then require a text equivalent? Don’t you always end up with text?
And doesn’t this ban the use of video or multimedia as a label? I’m not proposing such a thing, but it seems no less palatable than using an image or a sound as a label.
- sign-language interpretation
- translation of spoken words and other audible information into a language that uses a simultaneous combination of handshapes, facial expressions, and orientation and movement of the hands, arms, or body to convey meaning
One may translate only spoken words? Under WCAG, it becomes illegal to translate from one sign language to another. It also becomes illegal to do what Canada’s Copyright Act permits – translate a written work into sign language.
- text
- sequence of characters[.] Note: Characters are those included in the Unicode/ISO/IEC 10646 repertoire.
One may use only characters in Unicode. Given that several scripts are unencoded in Unicode, this may present a problem. Some East Asian languages are more robustly published with legacy encodings even if that is “improper.”
I repeatedly tried to explain to the Working Group that all that matters is a defined and understandable character encoding.
- text alternative
- programmatically-determined text that is used in place of non-text content, or text that is used in addition to non-text content and referred to from the programmatically-determined text
Hence a title attribute absolutely is a text equivalent. An image with empty alt plus a title containing what would otherwise be the alt text will pass WCAG 2. (There is some overlap here with UAAG, which can be interpreted to allow presentation of title text instead of regular or alt text.)
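The reading above can be sketched in a few lines. This is a minimal sketch, assuming that a checker treats any attribute text as "programmatically determined" and falls back to `title` when `alt` is empty; that fallback rule is an assumption made for illustration, not something WCAG 2 or UAAG spells out. The names `ImageTextCollector`, `text_alternative`, and `image_alternatives` are hypothetical.

```python
from html.parser import HTMLParser


class ImageTextCollector(HTMLParser):
    """Collect (alt, title) pairs for every <img> element in a fragment."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr_map = dict(attrs)
            self.images.append((attr_map.get("alt"), attr_map.get("title")))


def text_alternative(alt, title):
    """Assumed rule: use alt if non-empty, otherwise title, otherwise nothing."""
    return alt or title or None


def image_alternatives(html):
    """Return the text alternative found for each image, in document order."""
    collector = ImageTextCollector()
    collector.feed(html)
    return [text_alternative(alt, title) for alt, title in collector.images]


sample = '<img src="front.png" alt="" title="Front page of the newspaper">'
print(image_alternatives(sample))
```

Under that assumed rule, the image with empty `alt` and a populated `title` yields a text alternative just as an image with ordinary alt text would.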
- used in an unusual restricted way
- words used in such a way that users must know exactly what definition to apply in order to understand the content correctly
As opposed to when?
When is it possible to misunderstand the definition yet still “understand the content correctly”?
- variations in presentations of text
- changes in the visual appearance or sound of the text, such as changing to a different font or a different voice
Web authors now have to worry about sound of text? How, exactly? We don’t control people’s screen readers. (Does this mean we have to find a way to mark up intonation and prosody in our podcasts? How, exactly?)
WCAG 2 violates WCAG 1 by listing an equation (for brightness as used in the general flash threshold) as plain text. In so doing it pointlessly explains that “the ^ character is the exponentiation operator.” I thought this kind of thing is what we had MathML for.
The W3C Process (capital letter sic) is seriously broken – or at least WCAG Working Group’s application of it is.
We are starting to gather implementation examples during this Last Call review process. Implementation examples are examples of pages or sites that conform to the proposed WCAG 2.0 at various levels of conformance.
I don’t see any evidence that the Working Group is “starting to gather” anything. I don’t see evidence that they’re looking for or soliciting “implementation examples,” which in any event are virtually nonexistent. WCAG 2, after all, wasn’t released in anything resembling a final version until late April 2006. There hasn’t been time for authors, even if they wished to comply with WCAG 2, to take measures to do so. (Then there is the fact that there is no payoff for authors to comply with a specification that, first of all, isn’t final yet and that, second of all, they may seriously disagree with.)
The first public Working Draft of WCAG 2.0 was published 25 January 2001. Since then, the WCAG WG has published nine Working Drafts, addressed more than 1,000 issues, and developed a variety of support information for the guidelines.
Exactly how these 1,000 issues were “addressed” is open to dispute. Start with the use of a Mozilla Bugzilla database as a front end for bug reports. It’s a remarkably inaccessible form, and baffling even to a nondisabled expert. It’s true that many, possibly hundreds, of bug reports were remedied by rewriting the spec, but it’s also true that many bug reports were simply ignored (with responses that boiled down to “We don’t agree this is a bug”).
At time of writing, WCAG Bugzilla had 27 open bugs.
If a success criterion relates to a feature, component or type of content that is not used in the content (for example, there is no multimedia on the site), then that success criterion is met automatically.
What should happen is that the success criterion is not applicable. You can’t pass a guideline that doesn’t apply to anything in your document. By that logic, we’d all be awarded gold medals in the 100-metre dash just for not showing up.
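The distinction can be made concrete with a three-valued result. The following is only a sketch under stated assumptions: `Outcome` and `evaluate` are names invented here, and the per-item check is a stand-in for whatever a real success criterion would test.

```python
from enum import Enum


class Outcome(Enum):
    PASS = "pass"
    FAIL = "fail"
    NOT_APPLICABLE = "not applicable"


def evaluate(instances, check):
    """Evaluate one criterion over the content items it could apply to.

    An empty list means the feature is absent from the site, so the
    criterion is reported as not applicable rather than as a pass.
    """
    if not instances:
        return Outcome.NOT_APPLICABLE
    return Outcome.PASS if all(check(item) for item in instances) else Outcome.FAIL


# Hypothetical site with no multimedia at all.
print(evaluate([], lambda clip: clip["captions"]))
```

The "met automatically" wording corresponds to vacuous truth: in Python, `all()` over an empty sequence returns `True`, which is exactly the gold medal for not showing up. The explicit empty check is what keeps the two cases apart.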
The WCAG Working Group sometimes does a fine job articulating ideas that are incorrect in the first place.
After many, many warnings that they were making a series of mistakes and were not considering real-world Web sites, which they apparently never read, the WCAG Working Group went right ahead and listed the following for text equivalents to “non-text content”:
If non-text content presents information or responds to user input, text alternatives serve the same purpose and present the same information as the non-text content. If text alternatives cannot serve the same purpose, then text alternatives at least identify the purpose of the non-text content.
How do I “present the same information” – note, the same information – if my non-text content is, say, a thumbnail image of the front page of a newspaper? That’s a lot to retype into an alt text, don’t you think?
Again after many unheeded warnings, the Working Group published the following guideline for multimedia (at the highest level):
Sign-language interpretation is provided for multimedia.
First of all, which sign language? For an English-language source, no fewer than five distinct, if not always mutually unintelligible, sign languages can be identified (American, British, Irish, Australian, New Zealand).
More importantly, WCAG now requires translating a document (a multimedia file) into another language as a claimed accessibility provision. To restate the same question I have been posing for years, what prevents a Ukrainian-speaker from demanding that a Web site be translated into Ukrainian? After all, in both cases the issue is the incomprehensibility of the language of the original, not the disability. (A deaf person is not necessarily unable to read. Deaf people can and do understand and communicate in written language. A reliance on sign language, or even a preference for it, does not logically follow from being deaf.)
Following these guidelines will also make your Web content more accessible to the vast majority of users, including older users. It will also enable people to access Web content using many different devices – including a wide variety of assistive technologies.
The concepts of scoping, baseline, and target audience are so misguided as to derail WCAG’s entire project. The first two topics were addressed in my A List Apart article. The last one deserves mention here.
Information about audience assumptions or target audience. This could include language, geographic information, or other pertinent information about the intended audience. The target-audience information CANNOT specify anything related to disability or to physical, sensory or cognitive requirements. [Overwrought emphasis sic]
In other words, even if you extensively test your site and can demonstrate that it is accessible to people with disabilities, you cannot state as much. While the guideline appears to be intended to make it impossible to declare, for example, that a site is not meant to be used by blind people, it also becomes impossible to state that it provably can be used by them.
Nobody can understand what the hell a “Web unit” is. In the following explanation –
A Web unit conforms to WCAG 2.0 at a given conformance level only if all content provided by that Web unit (including any secondary resources that are rendered as part of the Web unit) conforms at that level.
– what happens if I have a page full of thumbnail images, each with correct alt text as required and each of which links to an image file of a larger version of the picture? Since the image by itself has no HTML or other markup, it’s impossible to write an alt text for it. Is this not a “secondary resource”? If it isn’t, does it not then constitute a “Web unit” unto itself? Since Web units that are simple image files cannot be made accessible, doesn’t WCAG 2 essentially ban freestanding image files? 
(We are later told that linking to nonconforming content “is not prohibited” – gee, thanks – but only if “the content itself is [not] a Web unit within the set of URIs to which the conformance claim applies.” Hence if my freestanding image is still hosted on my site, I have to make it comply with my conformance claim, which at the very least requires a text equivalent, in turn meaning I have to wrap the image file in HTML. But by the time you the site visitor have selected and loaded that expanded image, you will already have had a chance to read the alt text on the thumbnail image.)
WCAG 2 is nearly consistent in pretending that Web standards do not exist (with one curious exception that I’ll get to shortly). Some teenagers have greater understanding of valid, semantic markup than the Working Group does, as evinced in passages like these:
Information that is conveyed by variations in presentation of text is also conveyed in text, or the variations in presentation of text can be programmatically determined.
Now, what does “presentation” mean? Really?
Doesn’t the requirement to convey the information in text make it possible to write instructions for an online form as follows?
- Fields marked in red are required.
- Fields marked in green are optional but recommended.
I have just “conveyed” the colour differences. (It so happens that the colours are exactly the rare ones that are confusable to colourblind people.)
If I am using markup to vary presentation of text, as one typically will (how else do you do it if you aren’t using a picture of text?), how is that markup ever not programmatically determinable? The browser had to read it to vary the presentation in the first place. All the usual elements, like em, strong, b, i, and u, are understandable by a machine. So is CSS, even at the simple level used in this document as a demonstration (span class="red" or ="green"). More complex CSS selectors, like :last-child, are also programmatically determinable.
In essence, for any author using markup, even lousy presentational markup, how is it possible to flunk this criterion?
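To show how little it takes for a machine to "determine" such variations, here is a minimal sketch that walks a fragment and reports the elements named above plus any classed `span`. It relies only on Python's standard `html.parser`; `PresentationFinder`, `presentation_variations`, and the `PRESENTATIONAL_TAGS` set are hypothetical names chosen for this example, and the sketch makes no attempt to resolve CSS selectors.

```python
from html.parser import HTMLParser

# The elements listed in the paragraph above.
PRESENTATIONAL_TAGS = {"em", "strong", "b", "i", "u"}


class PresentationFinder(HTMLParser):
    """Record each element that varies the presentation of text."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag in PRESENTATIONAL_TAGS:
            self.found.append(tag)
        elif tag == "span" and css_class:
            # A classed span, e.g. span class="red", is just as machine-readable.
            self.found.append(f"span.{css_class}")


def presentation_variations(html):
    """Return the presentation-varying markup found, in document order."""
    finder = PresentationFinder()
    finder.feed(html)
    return finder.found


sample = '<p>Fields marked in <span class="red">red</span> are <strong>required</strong>.</p>'
print(presentation_variations(sample))
```

Note what the sketch cannot do: it finds `span.red` but learns nothing about what red is supposed to mean, which is the part a criterion on this topic would presumably need to address.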
Some parts of Web accessibility are not under the control of the author. The user agent, like a browser or screen reader (the latter of which is definitely included in WCAG 2’s definition of “user agent”), has a significant role to play. Nonetheless, WCAG lists these requirements:
More than one way is available to locate content within a set of Web units where content is not the result of, or a step in, a process or task.
Why can’t people be expected to simply use the Find command in their browsers, or the back button?
The same issue reappears in that classic bugbear of Web-accessibility pedants, hyperlinks:
Each link is programmatically associated with text from which its purpose can be determined. [...] The purpose of each link can be programmatically determined from the link.
“Purpose”? Doesn’t Slashdot have an enormous mass of code in its system to prevent people from linking to notorious vulgar images in the guise of a real hyperlink? (There the “purpose” is to deceive.) The “purpose” of a link is to provide a link, obviously.
Did they not mean the “destination” of a link? If so, how is it not obvious from the semantics of the link? Isn’t it embedded right in the a href=""? How is it impossible to “determine” the destination of a link? That’s the user agent’s job, is it not?
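If "destination" is what was meant, extracting it is routine work for software. The sketch below pairs each link's `href` with its link text using Python's standard `html.parser`. The names `LinkCollector` and `link_destinations` are invented for this example, the URL is a placeholder, and nested or malformed anchors are deliberately not handled.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, link text) for every <a href="..."> in a fragment."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the anchor currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def link_destinations(html):
    """Hypothetical helper: return each link's destination and text."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


sample = '<p>See <a href="http://example.org/report">click here</a>.</p>'
print(link_destinations(sample))
```

The destination is recoverable even when the link text is as uninformative as "click here"; what no parser can recover is the author's intent, which is why "purpose" is the slippery word.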
Incidentally, there have been a few experimental all-sign-language sites in which many links and their targets are given completely in sign language. There is no link text per se. What’s between <a> and </a> is a video file or image of a person using sign language, and where you end up is another such video file or image (or a page full of those). Given the semantics of all markup systems in use on today’s Web, the hyperlink has to contain text characters in order to function. A still image has to contain an alt text (though alt="" is plausible in some cases). 
Nonetheless, there are a few scenarios in which a page intended to be accessible to sign-language speakers uses no text at all. How is such usage accommodated in WCAG 2? (And must authors, by implication, use an interpreter to voice the sign language for blind visitors, which must then of course be captioned or transcribed? Where does it end?)
I argued with the Working Group for months over the concept of semantics in markup, that is, the use of the correct element for the content. This argument betrayed the Group’s arrogance and its thorough incompetence at standards-compliant Web authoring. It also proved they’ve been asleep at the wheel for the last eight years, in which people like me have been labouring to improve Web standards. This nonsense alone is enough to generate suspicion and distrust among competent and up-to-date Web developers.
Nonetheless, now the word “semantics” is included, without elaboration or definition, in the Understanding and Techniques documents (whose examples I am condensing into one excerpt below). Occasionally, the term is recast as “structure.”
A simple text document is formatted with double blank lines before titles, asterisks to indicate list items and other standard formatting conventions so that its structure can be programmatically determined.
HTML Techniques for Marking Text [...] Using semantic markup to mark emphasized or special text
Making information and relationships conveyed through presentation programmatically determinable USING the technology-specific techniques below (for a technology in your baseline) [...] Using semantic elements to mark up structure [...] The semantics of some elements define whether or not their content is a meaningful sequence. For instance, in HTML, text is always a meaningful sequence. Tables and ordered lists are meaningful sequences, but unordered lists are not.
CSS Techniques [...] Positioning content based on structural markup
The WCAG main document does a drive-by and just barely avoids mentioning semantics by name:
[Content] includes the code and markup that define the structure, presentation, and interaction, as well as text, images, and sounds that convey information to the end-user.
This means your markup is also your content, which will come as a surprise to those who are interested in separation of content, structure, presentation, and behaviour. Here, “markup that define the structure, presentation, and interaction” clearly refers to semantics.
Some omissions immediately spring to mind. I have not done an exhaustive check for such omissions.
Updated 2006.05.23