Opening up accessibility

One of the many losing battles fought in the trenches of the Web is accessibility. Usually the word refers to making Web sites usable and understandable to people with disabilities, which mostly means the blind and visually-impaired. Even that limited definition of access has netted next to no tangible accessibility work on the part of Web developers, many of whom still can’t be bothered to include meaningful ALT attributes for images. Still. In the year 2000. Despite being required in HTML 4.0.

Accessibility is actually a broader concept. Media access encompasses not only those with sensory disabilities (blindness and deafness), but also, to a limited extent, people with certain cognitive disabilities (like learning disabilities or autism). In fact, a more catholic understanding of online access boils down to maximizing your audience. Some examples:

Access for blind and visually-impaired visitors

Coding for meaning and structure (COLGROUP, STRONG, ADDRESS)
Textual equivalents (ALT, LONGDESC; previous discussion)
Audio description: Narration, read out loud by a human being (or, in the future, by voice synthesis), that succinctly explains visual details not apparent from the audio alone. Often misnamed “video description” (more than video can be described; the technique started in live theatre) or Descriptive Video (a service mark of WGBH’s Descriptive Video Service). Next to no online resources; some dry background research here
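
The textual-equivalents item above is the one piece of this list a machine can partly check for you. Here is a minimal sketch, in Python, of a scan for IMG elements with no ALT attribute. The names MissingAltFinder and find_missing_alt are ours, invented for illustration; only the standard-library html.parser module is assumed. It cannot tell you whether an ALT text is meaningful. That still takes a human.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect IMG elements that have no ALT attribute or an empty one."""

    def __init__(self):
        super().__init__()
        self.problems = []  # list of (src, reason) tuples

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src", "(no src)")
        if "alt" not in attributes:
            self.problems.append((src, "missing alt"))
        elif not (attributes["alt"] or "").strip():
            # An empty ALT can be legitimate for spacer images,
            # so this is flagged for a human to look at, not as an error.
            self.problems.append((src, "empty alt"))


def find_missing_alt(html_text):
    """Return (src, reason) pairs for images that need attention."""
    finder = MissingAltFinder()
    finder.feed(html_text)
    finder.close()
    return finder.problems
```

Feed it the source of a page and it returns the images worth a second look, in document order.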

Access for the deaf and hard-of-hearing

Captioning: Rendering of speech and other audible information in the written language of the audio. (See Gary Robson’s FAQ and contenu.nu-related resources.) Usually closed: Captions are encoded or invisible and must be decoded or made visible. Some captions are open and can’t be turned off.

Access for those who don’t understand your language

Subtitling, not the same as captioning. (The British use the term subtitling to refer to both. We hope to change that.) Despite their seeming similarity, captioning and subtitling have very little in common. How?
1. Captions are intended for deaf and hard-of-hearing audiences. The assumed audience for subtitling is hearing people who do not understand the language of dialogue.
2. Captions move to denote who is speaking; subtitles are almost always set at bottom centre.
3. Captions can explicitly state the speaker’s name:
  1. Cigarette Smoking Man:
  2. [MARTIN]
  3. >> Announcer:
4. Captions notate sound effects and other dramatically significant audio. Subtitles assume you can hear the phone ringing, the footsteps outside the door, or a thunderclap.
5. Subtitles are always open. Captions are usually closed (i.e., require a decoder, usually built into television sets, or a seat-mounted display, as in the WGBH Rear Window system).
6. Captions are in the same language as the audio. Subtitles are a translation.
7. Subtitles also translate onscreen type in another language, e.g., a sign tacked to a door, a computer monitor display, a newspaper headline, or opening credits.
8. Subtitles never mention the source language. A film with dialogue in multiple languages will feature continuous subtitles that never indicate that the source language has changed. (Or only dialogue in one language will be subtitled – cf. Life Is Beautiful, where only the Italian is subtitled, not the German.)
9. Captions tend to render the language of dialogue, transliterate the dialogue, or state the language:
  1. JE VOUS EN PRIE, MONSIEUR.
  2. OGENKI DESU KA?
  3. [SPEAKING RUSSIAN]
10. Captions ideally render all utterances. Subtitles do not bother to duplicate some verbal forms, e.g., proper names uttered in isolation (“Jacques!”), words repeated (“Help! Help! Help!”), song lyrics, phrases or utterances in the target language, or phrases the worldly hearing audience is expected to know (“Danke schön”).
11. Captions render tone and manner of voice where necessary:
  1. ( whispering )
  2. [BRITISH ACCENT]
  3. [ Vincent, Narrating ]
12. A subtitled program can be captioned (subtitles first, captions later). Captioned programs aren’t subtitled after captioning.
Dubbing: Replacing vocal tracks with vocal tracks in another language. Dubbed programs can be and are captioned.
Localization: Altering software or computer documents so they will be accepted as natural-seeming and fluent in another language.

Browser compatibility

Yes, friends, making your site work in Netscape is an accessibility practice.

Word to the wise: Don’t think for a second that any of these techniques, apart from certain HTML attributes, is remotely simple or straightforward. Captioning, audio description, subtitling, and dubbing are spectacularly difficult and nuanced arts that, by and large, are mishandled. (We go back 20 years in media access. We know what we’re talking about.) Even with currently rampant quality problems, those Big Four access techniques should be left to qualified experts. In general, you shouldn’t try this at home, kids.

Multimedia access

We are not big fans of video on the Web (first discussion; the Convergence Myth). Welcome to reality: It’s happening anyway.

The National Center for Accessible Media at WGBH, Microsoft, and the World Wide Web Consortium have put a great deal of well-meaning effort into accessibility for Web multimedia.

It ain’t going nowhere. Despite the existence of free caption- and description-creating tools like MAGpie (Windows only), nobody’s in a rush to bother adding captions and descriptions to multimedia clips. It’s too complicated, intrinsically (nuanced arts, remember?), technically, and conceptually. And any approach that requires a special version of player software on the viewer’s end (e.g., QuickTime Pro per se) is doomed.

The goal all along has been to produce a system similar to closed access on television – you activate display of captions decoded by a chip or external set-top device, or you turn on the Second Audio Program channel to hear audio descriptions. The goal in multimedia access, then, is to cybercast one video stream. You could then activate captions or descriptions, or both, or neither, as you liked. Sounds a lot like a TV signal, doesn’t it?

And on how many occasions has TV translated well to the Web?

This approach rests on premises that, while valid in the world of television, were misapplied to the online milieu or have been superseded.

On TV, there’s exactly one feed for PBS or CBC or Showcase. (Well, technically there may be a handful of dayparted feeds for different timezones. In practice, there is an extremely limited number of program streams and broadcast times, as selected by the network.) Online, any number of feeds are conceivable, and access can be granted at any second of any day according to the viewer’s wishes.
On TV, spectrum is limited; while broadcast and cable TV divide up the spectrum differently, spectrum is discrete and allocated by government license. There is no channel 13.7 on your TV dial, for example. Online, there are hardwired limits, but they are not related to regulatory decrees. You could, in fact, imagine ten channels between 13 and 14, all dedicated to PBS or CBC or Showcase. (You could imagine channels without numbers.)
An advantage of the single-network approach of television is the limited number of head ends: You need only so many tape decks and transmitters. Online, every streaming video request adds marginal overhead. But that problem has been largely solved by Akamai and other server-redistribution technologies. You can now stream as many signals as you want, within reason.

Multimedia access: The new, improved way

OK. Everything we’ve done so far hasn’t worked. Here’s what will.

Produce captions, descriptions, subtitles, and dubs for your video snippet. Go out-of-house: You aren’t going to have the expertise or equipment in your facility.
Digitize separate streams with open versions of those features – captions or descriptions or subtitles (or whatever) that you cannot turn off. In some cases, you can combine feeds: A subtitled film should also be captioned; a feed with captions and descriptions is useful to access freaks and kids with learning disabilities.
Use Akamai or equivalent technology to serve multiple streams at once from the same startpage. Set up a form along these lines (in this case, for a movie trailer):
View the Wuthering Heights trailer with any or all of the following
- Captions: Verbatim – Normal – Easy
- Audio descriptions
- Subtitles: Français – Español – Deutsch
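
Behind a form like that, the server only has to map the visitor’s choices to one pre-encoded open stream. A minimal sketch in Python, assuming a naming scheme we made up for illustration (the function names, the cc-/ad/sub- suffixes, and the language codes are all hypothetical; nothing here is an Akamai API):

```python
from itertools import product

# Choices offered by the form; None means "not selected".
CAPTION_STYLES = (None, "verbatim", "normal", "easy")
SUBTITLE_LANGUAGES = (None, "fr", "es", "de")


def stream_name(title, captions=None, descriptions=False, subtitles=None):
    """Build the (hypothetical) name of the open stream matching the form."""
    if captions not in CAPTION_STYLES:
        raise ValueError(f"unknown caption style: {captions!r}")
    if subtitles not in SUBTITLE_LANGUAGES:
        raise ValueError(f"unknown subtitle language: {subtitles!r}")
    parts = [title]
    if captions:
        parts.append("cc-" + captions)
    if descriptions:
        parts.append("ad")
    if subtitles:
        parts.append("sub-" + subtitles)
    return "_".join(parts)


def all_stream_names(title):
    """List every combination the form can request for one title."""
    return [
        stream_name(title, captions, descriptions, subtitles)
        for captions, descriptions, subtitles in product(
            CAPTION_STYLES, (False, True), SUBTITLE_LANGUAGES
        )
    ]
```

Under these assumptions, stream_name("wuthering-heights-trailer", captions="easy", subtitles="fr") names the easy-captioned, French-subtitled feed, and all_stream_names tells you how many variants you would have to encode if you offered every combination rather than a sensible subset.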

Everybody’s happy. No special player, no special authoring software.

This way, we serve the same goals as closed access – sensitive nondisabled English-speakers never have to be bothered by extra words onscreen or in audio – while opening up access to new markets using existing multimedia players.

What’s not to like?

How are we gonna pay for all this?

You’re already wondering how to pay for this sort of thing. People always do. You’ll drop any amount of money on a launch party (or a schmooze party for the film festival, or a news helicopter), and you’d never imagine attempting to charge for main audio or video. But the minute anyone suggests accessibility, all of a sudden everybody’s poor.

So here’s how you pay for it. You save the verbatim transcription of the program. You also save the script produced for the audio description of your program. If you’re smart, you buy two sets of descriptions: continuous, where the descriptions proceed through the entire program with few, if any, pauses, and interlude, the normal kind, with descriptions mostly limited to pauses in dialogue.

Combine those scripts in a way that is entirely understandable and flows well if read continuously, rather like a film script, and voilà, you’ve just produced a searchable, indexable text-only analogue of your audiovisual program. You can suddenly do the following:

Archive the text at your site, letting people search, browse, and read it. (How else do you index huge swaths of video? You index the text and link to the video.) Not a moneymaker directly, but all of a sudden people enjoy increased usability at your site, meaning more traffic, meaning... whatever higher traffic means on sites these days. (A tenth of a cent higher CPM, we guess.)
Sell the Unified Transcript™ (our trademark) online, at Contentville or moral equivalent and via Lexis/Nexis or moral equivalent. U.S. television networks already do something similar, but their transcripts are of poor quality and do not represent visual elements at all.
License the foreign-language text versions to sites working in that language. (Cheaper than starting up a portal in that language, isn’t it? And it pays.)
License the audio-described version to MP3 sites. Put out a CD and an audiocassette for motorists. (Makes no sense? Well, what’s gonna work to get the word out for your new 40-minute indie film? Getting people with narrowband connections to hear a version of your movie in their cars or annoying them with your choppy online video? Do you want their experience to be partial and enjoyable – no visuals but well-crafted audio – or partial and frustrating?)
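
The combining step described above, where the verbatim transcription and the description script are interleaved into one readable text, can be roughed out in a few lines. This is a sketch only, in Python, assuming both scripts arrive as lists with start times in seconds; the function names and the data layout are ours, and the real editorial work of making the result flow is not something code will do for you.

```python
def build_unified_transcript(dialogue, descriptions):
    """Interleave dialogue and description lines in time order.

    dialogue:     list of (start_seconds, speaker, text)
    descriptions: list of (start_seconds, text)
    Returns a list of (start_seconds, line) pairs.
    """
    # The middle number breaks ties: descriptions come first at equal times.
    entries = [
        (start, 1, f"{speaker.upper()}: {text}")
        for start, speaker, text in dialogue
    ]
    entries += [(start, 0, f"[{text}]") for start, text in descriptions]
    entries.sort(key=lambda entry: (entry[0], entry[1]))
    return [(start, line) for start, _, line in entries]


def search(transcript, phrase):
    """Return the start times of every line containing the phrase."""
    needle = phrase.lower()
    return [start for start, line in transcript if needle in line.lower()]
```

With made-up sample lines, usage might look like this; the start times returned by search are what you would use to link from the text back into the video.

```python
dialogue = [(4.0, "Cathy", "Heathcliff!"), (9.5, "Heathcliff", "I am here.")]
descriptions = [
    (0.0, "Wind sweeps across a dark moor."),
    (7.0, "A figure appears at the window."),
]
transcript = build_unified_transcript(dialogue, descriptions)
hits = search(transcript, "window")
```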

You don’t think you can earn back a few grand in accessibility costs this way?

We think this All Access™ approach (another of our trademarks) is a smart way to proceed with, say, the digitization of television and film archives. And in any event, our approach is absolutely the only one that’s going to work in a world of online film distribution. It’s widely accepted that DVD will be the last hardware distribution medium for films. Online connections, eventually, sometime, at long last, will provide video-on-demand. And English-language Hollywood films with no access features are a lousy investment, because only a few hundred million viewers can actually understand the content, compared to many more hundreds of millions for variants of that content with access features.

Inevitably, some selection mechanism will be developed to enable users to pick and choose their desired access provisions. There are ample precedents, including closed captioning and description on TV and the menus encoded into typical DVD movies.

Moreover, a byproduct of captioning, audio description, subtitling, and dubbing is the creation of a network of text, audio, and video variations that themselves have value and can command a price.

Accessibility is firmly rooted in a moral and indeed social responsibility, a concept very much out of favour today. But frankly, it also makes business sense.

We reiterate: What’s not to like?

Posted on 2000-08-27
