AUTHOR’S NOTE – You’re reading the HTML version of a chapter from the book Building Accessible Websites (ISBN 0-7357-1150-X). Copyright © Joe Clark, 2002. All rights reserved.
You now have world-class accessibility knowledge. You border on guru status. You’re just super.
The rest of the world, however, is not.
In so very many cases, the Web accessibility techniques I have presented in this book are retrofits, workarounds, kludges. Accessibility can be made practicable, yes. You know how to do that in spades right now. Yet it is difficult to make accessibility elegant. We will need to invent entire new technologies to reach that goal. And an entire new discipline of accessibility training will need to be brought into the world.
Over and over in this book, tediously, inescapably, we have faced the Sisyphean burden of rejiggering Websites designed for random access by the naked eye into some kind of half-arsed compatibility with devices that read sequentially. Quite simply, much of the content on Websites is made to be ignored most of the time, but it is impossible to ignore part of something when that something is presented to you in its entirety as a stream of computerized yammering from a screen reader.
We have been more or less successful in our efforts. With techniques like skipping navigation and moving from one well-labeled table cell to another, it is quite possible to read, understand, and enjoy a Web page with minimum fuss, even via screen reader or Braille.
But let’s be frank here. We may have achieved practical or passable accessibility, but what we have not achieved is elegance. What we have done, in effect, is to focus a camera at a stageplay and pretend the result is cinema.
How can we create Websites that look appropriate for nondisabled users of graphical browsers (containing within themselves basic accessibility features) and parallel Websites offering the same or analogous information that are optimized for screen-reader or Braille access?
Our challenge is simple to express but onerous to achieve: Shoving all navigation to the bottom of the document, leaving “content” and possibly search facilities up top. Which would you rather hear first?
Setting Braille aside, in effect we are faced with the dilemma of asking visual Web designers and Web programmers to create audio interfaces – a means of interacting with a computer system through sound. In practice, this really means sound output and not input, but that isn’t different in principle from the graphical Web: The Web may communicate with you through visual output, but you don’t communicate with the Web through visual input.
Some work has already been done on audio or auditory interfaces, chiefly by T.V. Raman, author of the book with the surprising title Auditory User Interfaces (Kluwer Academic Publishers, 1997). Raman formerly worked at Adobe, developing the PDF2txt and PDF2HTML utilities that convert PDFs into plain-text and HTML documents. (Look for them at access.adobe.com.) At time of writing, Raman is a researcher with IBM, working on the so-called multimodal Web. (We met Raman in Chapter 11, “Stylesheets.”)
Bill Buxton, formerly of the University of Toronto and now the chief scientist at Alias Wavefront, has promoted a range of common-sense ideas in human–computer interaction, such as the use of both hands and the sense of hearing, including stereo sound. But he’s just a piker compared to other researchers, who have explored auditory interfaces since the mid-1980s. Development has been pronounced in the field of earcons – “non-verbal audio messages that are used in the computer/user interface to provide information to the user about some computer object, operation or interaction,” according to Meera M. Blattner’s definition. Earcons are not sounds we would recognize from the real world (dog barking, typewriter bell, rustling of leaves), but arbitrary sound sequences, usually musical in nature. The typical proposed application is to alert the computer user to an ongoing process (like downloading a file), though some experiments have been conducted on the use of earcons in user interfaces. Imagine your computer playing a musical note as you pull down a menu. That sort of thing. (Citations are provided in the Bibliography.)
Web-specific audio interfaces have also been studied, with promising results. On the vexing problem of navigation of data tables, researchers at the University of Glasgow added tones whose pitch increased as you moved down and/or across a table. The addition made the task of finding the answer to a question from deep within the numbers in a table significantly faster, easier, and more pleasant. Meanwhile, the University of Hertfordshire developed a prototype system using earcons and other non-speech sounds to navigate through “hypermedia information” (don’t you love the academic terminology?); in effect, the researchers wrote their own screen reader. Most subjects found it easy to navigate through a purpose-built, self-contained “hypermedia” virtual world using voice and sound cues. (The virtual world included descriptions of paintings and artworks. “Participants were very pleased with the tactile pictures and accompanying descriptions; for example, they were excited to find out what a Beefeater, the Houses of Parliament, and famous paintings looked like,” researchers reported. Don’t underestimate the blind population’s thirst for information about the visual world.)
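The Glasgow technique is easy to caricature in code. A toy sketch in Python (the base frequency and step ratios are mine, not the researchers’ values): a tone whose pitch climbs as you move down or across the table, so your ear tracks your position.

```python
def cell_pitch(row, col, base_hz=220.0, row_step=1.05, col_step=1.12):
    """Map a table coordinate to a tone: pitch rises as you move
    down and/or across, so the listener hears where they are."""
    return base_hz * (row_step ** row) * (col_step ** col)

# Moving down the first column, each cell sounds a little higher.
tones = [cell_pitch(r, 0) for r in range(4)]
```

The exact mapping matters less than its consistency; once learned, the rising contour becomes a landmark, exactly like indentation on a visual page.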
Meanwhile, one screen-reader manufacturer, Alva Access Group, is cooking up a putatively new approach to reading the screen, one that simplifies commands and uses “multispatial” sound and voice.
All very futuristic.
Unfortunately, none of this research and development is going to matter in the real world without standardization and its kissing cousin, ubiquity.
Let us cast our thoughts back to the dark prehistory of HTML 4, or “oldschool HTML” as I have so charmingly dubbed it. It was impossible to find a browser, any browser anywhere, that supported every single accessibility feature. (Possible exception: The World Wide Web Consortium’s testbed application Amaya, a stillborn Edsel of a browser/editor beloved by an Übergeek cadre of W3C policy wonks.) Even as I write this there is no browser at all that supports every nook and cranny of accessible HTML, not even Mozilla.
Now, though, we are contemplating opening the experience of Web browsing to an almost-entirely-new dimension, sound. It’s as big a jump as leaping from oldschool HTML to Java applets. Bigger, even. It’s almost as significant as jumping from Usenet and E-mail to the Web itself. For once in our blessed lives, the debased catchphrase “paradigm shift” is genuinely applicable.
Auditory interfaces would have to manifest themselves at the level of the screen reader or, better yet, the browser. The kinds of auditory metadata and cues that researchers have laid out for us would have to become as commonplace as visual-interface metadata and cues, like tooltips for titles or a shift in cursor shape when you run the mouse over a link. And their specific meaning would have to be as widely understood as left-hand navigation, tabs, and underlined links.
What’s more, developers today do not make use of the access tags that are already supported. Still, to this day, we find images without so much as an alt text, a laughably simple, widely supported, and actually obligatory access attribute. Now we’re imagining the creation of a parallel auditory interface to accompany a Website’s visual interface.
At present, it is difficult even to specify the fonts you prefer in certain HTML tags. Something as conceptually simple as relative font sizing in cascading stylesheets (from xx-small to xx-large) is improperly implemented in visual browsers. And now we want people – developers or “end users” or both – to be able to select little sounds and tones that relate to Website constructs, like links, text, input fields, and navigation? I don’t think so.
There’s another parallel: Skins. The rampaging popularity of WinAmp, with its selectable appearances and interfaces (“skins”), suddenly made us wonder why we were putting up with software designers’ deciding for us how their programs would look all these years. Apple implemented and then crippled Themes and Appearances in the Mac OS, for a plausible reason: It is sobering to imagine tech-support staff having to explain how to perform action X when you’ve turned your entire interface into Teletubbyland. Windows XP and Mozilla support skins. Is it time for auditory skins, with sounds you yourself select to indicate Website landmarks?
A feeling of déjà vu will surely sweep over you, esteemed reader, as you think back to the vexing dilemma of making Flash animations and online video accessible. Even if we had perfect tools for that purpose, we are nonetheless faced with the prospect of training creators in the artistic development of access formats like audio description. In the present case, we somehow expect visual Web designers, whose all-too-frequent crimes against usability are merely the most prominent of their many venial sins, to master a new means of expression that they themselves will never use by choice.
Somehow the use of auditory cues will have to become instantly widespread for it to reach a “tipping point” (pace Malcolm Gladwell) and become standardized.
Static Web design at least follows in the footsteps of print design; multimedia stands on the shoulders of cinema and static Web design. But what are the antecedents of auditory interfaces? They were all born of research projects. There are no real-world examples to rely on. This vicious cycle in and of itself will smother the deployment of audio interfaces.
In the catechism of Web authoring, HTML corresponds to structure and stylesheets to presentation. The distinction is not always clear-cut, and I am about to propose a muddying of the waters.
We should be able to mark up sections of our documents with standardized tags describing the structure and purpose of those sections. Note that section-level structure is largely unconsidered in HTML. You could argue that any content below a heading tag (up to but not including another heading tag) is a conceptual section, as is any kind of list. Certainly a frame is a section, as is a free-standing table inside a page.
All right, fine. So we do have sections after all. Fine.
But do we have sections called, for example, TopNav, LeftNav, BottomNav, AnchorNav (for navigation purely within the page), Searchbox, Body, or Sidebar (as in newspaper sites)? How about sections with names suited to auction sites? help-wanted sites? search engines?
Now, if we had a standardized vocabulary of section elements (including foreign-language translations where applicable – the French should be able to use navgauche instead of LeftNav), wouldn’t we be able to set up our pages for re-manipulation out in the field?
If you wrapped your navbars in markup of that kind, couldn’t a really intelligent browser or device happen along and reorder those segments for screen-reader display, content first and navigation last?
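For instance – and none of these element names exists in HTML; the markup and the reordering routine below are a hypothetical sketch in Python, using the section names proposed above – a tool at the reader’s end could pull the named sections apart and emit the Body first:

```python
import re

page = """<TopNav>Home | Products | Contact</TopNav>
<LeftNav>Specials | Search by category</LeftNav>
<Body>Welcome to our store. Today's feature: fresh doughnuts.</Body>"""

def reorder_for_screen_reader(html):
    """Pull out each named section, then emit Body first and
    the navigation sections last -- content before chrome."""
    sections = dict(re.findall(r"<(\w+)>(.*?)</\1>", html, re.S))
    order = ["Body"] + [name for name in sections if name != "Body"]
    return "\n".join(f"<{name}>{sections[name]}</{name}>" for name in order)
```

The author never maintains a second page; the reordering happens out in the field, on demand.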
You would never have to bother maintaining alternate page structures. With your document properly marked up, anyone who requires an alternate can rejigger the page at their end, unbeknownst to you and without affecting anyone else’s experience of the site. (It’s the closed-access philosophy, à la captioning and audio description: Build accessibility into original source and let people turn the features on and off as desired. Do so even if the result is so very different from the inaccessible original that many who don’t need accessibility would never tolerate being stuck with it permanently.)
Standardized grammars are not without precedent. Dublin Core metadata (dublincore.org) are a standardized means of adding information about a page within its <head></head> element. In theory, compliant devices can sort and reorder such documents based on the metadata contents. If, for example, 20 of the 300 pages at your site deal with a specific topic (e.g., a topic for which a Library of Congress subject heading exists, like “Closed captioning” or “Video recordings for the visually handicapped”), a device could skim through all 300 pages and present a hitlist of only the documents matching that subject heading. By strictly limiting the range of permissible classifications, it becomes possible to engage in pinpoint manipulations – a triage much more precise than you’d get from a search engine chugging through the words in all your pages, to continue with the preceding example. (Among other things, there are fewer items to look at if you concern yourself only with Dublin Core metadata. It’s easier to examine 300 <head></head> elements than 300 full texts.)
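A sketch of that triage, assuming pages encode Dublin Core subjects in the conventional <meta name="DC.subject"> form (the filenames and page fragments here are invented):

```python
import re

# Invented example pages; only the <head> metadata matters here.
pages = {
    "captions.html": '<head><meta name="DC.subject" content="Closed captioning"></head>',
    "donuts.html":   '<head><meta name="DC.subject" content="Doughnuts"></head>',
    "ad.html":       '<head><meta name="DC.subject" content="Video recordings for the visually handicapped"></head>',
}

def hitlist(pages, subject):
    """Examine only the head metadata -- never the full text --
    and return the pages filed under the given subject heading."""
    pat = re.compile(r'name="DC\.subject"\s+content="([^"]*)"')
    return sorted(url for url, html in pages.items()
                  if subject in pat.findall(html))
```

Because the vocabulary is controlled, the match is exact; no stemming, no relevance ranking, no fuzz.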
Now, of course, as with anything smart and promising and of true practical utility in the wide world of the Web, Dublin Core metadata are essentially unused. (But Flash? Flash everybody is using!) There is little incentive to invent devices that can manipulate such metadata if no one is using it. The same fate may befall section tags.
Or perhaps the use of section tags will eventually become widespread. There’s already a World Wide Web Consortium group working on what are called Composite Capabilities/Preferences Profiles (CCPP; see CCPP.org), which will definitely encompass the kind of accessibility reconfigurations discussed here. But like all voluntary standards, CCPP faces an uphill battle. Frankly, it is much more expedient to hack a few alt texts into a Web page and call it accessible (an approach this book does not merely authorize but actively teaches you to do) than to reprogram a site so that it becomes elegantly accessible.
A very big, very necessary new invention can be summed up in a very few words: An authoring tool that maintains accessibility assets in a database for manipulation. Assets in this category include:
longdesc associated with images, so that when an image is used or reused in a site or is shared with other sites, its prewritten text analogues automatically appear. (You need write such analogues only once per language.) Multilingual sites must keep track of multiple-language variants. Sliced images must be maintained as a unit, with the full panoply of text equivalents.
tabindex, either fixed or variable, according to page and site requirements.
abbreviations maintained sitewide (tricky, since the same character strings used on different pages might or might not actually be acronyms or abbreviations).
“Once accessible, always available” should be the philosophy here.
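A minimal sketch of such an asset store, using SQLite from Python’s standard library; the schema and every name in it are mine, not any vendor’s:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE image_assets (
    filename TEXT, lang TEXT, alt TEXT, longdesc TEXT,
    PRIMARY KEY (filename, lang))""")

# Write the text equivalent once per language; every page that
# reuses the image picks it up from here thereafter.
db.execute("INSERT INTO image_assets VALUES (?, ?, ?, ?)",
           ("cart.gif", "en", "Shopping cart", None))
db.execute("INSERT INTO image_assets VALUES (?, ?, ?, ?)",
           ("cart.gif", "fr", "Panier", None))

def alt_for(filename, lang):
    """Look up the stored alt for an image in a given language."""
    row = db.execute(
        "SELECT alt FROM image_assets WHERE filename=? AND lang=?",
        (filename, lang)).fetchone()
    return row[0] if row else None
```

The shopping-cart example is the case described below: one invariant graphic, two language-specific text equivalents, maintained in exactly one place.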
Now, all this is readily achieved in a costly content-management system, and I can envisage Broadvision or Vignette marketing a $50,000 module that takes care of these tasks. I can also imagine shareware applications using MySQL or other free tools. I am having a hard time imagining Macromedia or Adobe ever getting around to adding these functions to Dreamweaver or GoLive. Call me a pessimist.
On the other hand, the blossoming of homegrown content-management tools – Blogger, Greymatter, and Movable Type are the Big Three among so-called independent Websites – gives me cause for optimism. If anyone’s going to implement intelligent database structures and reconfigurable site structures along the lines of CCPP, these dedicated hobbyists are; the diamond-studded content-management systems won’t get around to it until there’s cash on the barrelhead paying for such development. Open-source content managers like Zope are another glimmer of hope.
As discussed in Chapter 14, “Certification and testing,” the degree of accessibility of a Website is not cut-and-dried; it’s all relative. Fixing access problems at a Website is, at present, rather inconvenient. It is quite possible to quickly check for the absence of text analogues on images through search and replace, but subtler issues, like the use of abbr or navigation, require careful thought and subjective evaluation. Then again, even a “quick” check of images ceases to be quick if you’re stuck doing it for dozens or hundreds of pages or templates.
Our authoring tools should take a good run at automating this process for us. We could imagine a system that, working in tandem with the asset database, presented a custom-made screen of allegedly inaccessible components. Imagine a list of all the images on your site that do not come equipped with alt texts laid out before you, with type-in fields alongside. Enter an alt once and it propagates automatically not merely through that page, but everywhere on your site (and on any other sites whose files reside on a server you specify). If a graphic with a single function has two appearances in two different languages, the authoring tool is smart enough to show them to you together so you can make the text equivalent; you can also enter language-specific text (e.g., for an invariant shopping-cart icon whose alt must differ on English- and French-language E-commerce sites).
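A toy sketch of that propagation step. The regular expression is only good enough for this tidy example – a real tool would parse properly – and the filenames and texts are invented:

```python
import re

def propagate_alts(html, alts):
    """Find img tags with no alt and fill one in from the
    sitewide table -- enter an alt once, fix it everywhere."""
    def fix(match):
        tag, src = match.group(0), match.group(1)
        if "alt=" in tag or src not in alts:
            return tag  # already accessible, or nothing on file
        return tag[:-1] + f' alt="{alts[src]}">'
    return re.sub(r'<img src="([^"]+)"[^>]*>', fix, html)

# One entry in the database repairs every page that uses the image.
alts = {"logo.gif": "Consolidated Donuts home"}
fixed = propagate_alts('<p><img src="logo.gif"> Welcome.</p>', alts)
```

Images that already carry an alt are left strictly alone; the tool repairs omissions, it doesn’t overwrite judgment.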
Such a tool could display a thumbnail of your page, onto which you could drag and drop (or add via keyboard commands) a set of tabindex values. (We can’t rely solely on drag and drop. The authoring tool itself must be accessible to a disabled person. In this case, one could tab from element to element and take action through keyboard commands where necessary.)
A very intelligent repair tool would spot typographic attributes used to make text look like a heading (e.g., red) and suggest a stylesheet attached to an actual <hx> heading tag to take its place.
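A sketch of that kind of detection; the particular font-tag pattern and the class name are invented for illustration:

```python
import re

# One invented typographic "fake heading" pattern: big red bold text.
FAKE_HEADING = re.compile(
    r'<font size="\+2" color="red"><b>(.*?)</b></font>', re.S)

def suggest_headings(html):
    """Flag typographic fake headings and propose real markup:
    an h2 whose appearance is restored in the stylesheet."""
    return [(m.group(1), f'<h2 class="loud">{m.group(1)}</h2>')
            for m in FAKE_HEADING.finditer(html)]
```

The structure becomes real and machine-readable; the look is recreated in CSS, where it belonged all along.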
Unlike the nag known as Bobby, repair tools of this calibre would let you select the complexity for which you wish to test. Don’t feel like adding titles to all your images? Deselect that option. Don’t particularly care about acronyms? No one says you have to, necessarily. Yet the tool would be smart enough to actually require what is required: You could not opt to skip an alt text, for example.
We do have something along these lines: A-Prompt from the University of Toronto (APrompt.ca), a Windows application that walks you through the entire range of accessibility issues on a site. It lacks the database function, random access, and pseudointelligence of what I have in mind, but it gets us more than halfway to our goal. (It’s included on this book’s CD-ROM.)
Eventually, every midsized Web developer, whether in-house or in a consultancy, will require automated accessibility repair, because their clients will require repair services. Ultimately, though, building access into original designs will become so commonplace that the need to retrofit accessibility will diminish. That fact in itself may discourage developers from inventing the repair tool envisioned here.
Online accessibility requires subjective interpretation. It’s hard enough summing up still images in words. The Big Four access techniques for multimedia – captioning, audio description, subtitling, and dubbing – are exacting and ill-understood, and demand a range of skills that Web designers and programmers do not immediately have.
Do you, in all honesty, have anything resembling the skills necessary to carry out this kind of work?
Probably not. But can you learn it? Probably so. Even the tough stuff.
But where are you going to learn it? There’s barely any training at all for conventional Web design these days. Bookshelves groaning under the weight of how-to books like this one do not constitute training, and spotty college and night courses here and there don’t cut the mustard, either. Yes, Web design is a new field, meaning that everyone who’s any good at it is doing it rather than teaching it. But that just proves the point: We’re working in a medium of mass communication populated largely by autodidacts.
So it may be idealistic to imagine training for the subjective aspects of accessibility. Or maybe it isn’t: Maybe the same companies that caused the problem of inaccessible imagery and multimedia in the first place should join forces to fund the development of training courses and materials that will teach any halfway intelligent person to do halfway intelligent access work. One could imagine a multilingual off-the-shelf product, containing DVDs, CDs, and print materials, training Web designers and programmers in the Big Four access techniques, with conventional Web access, as documented in this book, also explained.
Such a training program would be a worldwide first. Are you aware that there is no standardized training at all for the existing industries of captioning, audio description, subtitling, and dubbing? (This explains why quality and technique vary so appallingly in those fields.) Adding the responsibility of adapting those techniques to multimedia is a lot to ask. But we need it anyway, no matter how difficult the task may appear.
This is a particular concern of mine, and by the time this book hits those groaning bookshelves there may already be news on that front. But at present, the only training that’s widely available forms part of this book, which is quite insufficient even by my own admission.
This chapter is entitled “Future dreams,” so let’s do some dreaming. What we need are: auditory interfaces whose cues are as standardized and as widely understood as visual ones; a standardized vocabulary of section tags, so that pages can be reconfigured out in the field; authoring tools that maintain accessibility assets in a database; intelligent, configurable repair tools; and real training in the subjective arts of accessibility.
There. I have thrown down the gauntlet.
How will we turn these future dreams into present-day reality?