Eye-gouging accessibility

Now, if you will permit us to giddyup one of our hobbyhorses for a moment, we would like to place something in the record about the general hideousness of online captioning typography.

“Wh-wh-wha’?” you blubber, doing a triple-take. Yes, dear friends, while online video may be its own punchline (it’s still too small and jumpy five years into the game), you still need to make it accessible. Everyone pretty much assumes that means accessible to deaf and hard-of-hearing viewers, and we’ll humour those people for the time being. (You think blind people aren’t interested in online video? And isn’t online captioning most useful for people stuck in offices with crappy speakerless Windows clones – or trapped in an office where you don’t want the boss to hear the video you’re watching?)

Every available captioning method is crap. Indeed, every available method fails to be better than reality, if we may quote our favourite usability “super-expert.”

What is the reality, then? Crappy Line 21 TV closed captions and (generally) crappy DVD bitmaps. (World System Teletext captions in PAL-format countries are also pretty awful, but we don’t have adequate pictures to show you. There pretty much are no adequate pictures, an admission of the deplorable appearance of WST captions. Nobody wants the truth to be widely displayed.)

Line 21 captions

Coarse dot matrix
Often no descenders
Almost invariably white on black
Monospacing
Still usually all-uppercase

DVD subtitles

Bitmaps (“subpictures”) with four colours and four transparency values
Generally poor typeface selection by ignorami and non-experts using the antitypographic Windows platform
Almost no use of custom-engineered screenfonts
Background masks almost never used
Usable for foreign-language subtitles (shown here) or same-language captions

Care to compare both at once? On Region 1 DVDs, you can.

Line 21 captions vs. DVD subtitles

In DVD bitmaps, note tendency to adopt subtitling-like all-centred placement (skimpy and inadequate for deaf viewers) mated with Line 21–legacy typographic accommodations (like spaces inside brackets)
Line 21 captions are uglier but always more readable despite bitmaps’ enormous available palette

Still no better than reality?

So how are things different online?

Well, start with the fact that the three dominant online video players – QuickTime, Real, and WiMP – all theoretically “support” captioning, but the exact meaning of “support” is critical here. All three players can display captions. The problem is getting the captions into the players.

First, captions themselves are usually saved as text files. (You can use plain text. RealPlayer also understands its own proprietary format, RealText, while QuickTime also supports the proprietary QTtext format.)
A second file controls presentation and timing of captions. Both QuickTime and RealPlayer let you use SMIL (Synchronized Multimedia Integration Language), a markup language similar to HTML, to synchronize captions to online video.
Windows Media Player is different. It uses a proprietary Microsoft format called SAMI (Synchronized Accessible Media Interchange) that combines caption text and presentation instructions in a single file. But WiMP does not support all of SAMI even though both were invented by Microsoft!
You can see, then, that there isn’t a single presentation file format that can be used in every player. SMIL is arguably the closest thing to a universal format, given that it functions with two players that themselves work on multiple computer platforms.
While there are quite a few programs that can produce or export SMIL files (which, like HTML, are simple text files you can manually type out yourself in any text editor if you have to), the fact remains that you have to create two files – the captions themselves and a special file to synchronize captions.
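For the curious, here is a rough sketch of what that second, synchronizing file amounts to. It uses Python's standard library to write a bare-bones SMIL 1.0-style document that plays a video and a caption text stream in parallel. The filenames (clip.rm, captions.rt), the region names, and the pixel dimensions are all made up for illustration, and real players may want more attributes than this; treat it as a sketch, not a recipe.

```python
import xml.etree.ElementTree as ET


def build_smil(video_src, captions_src, width=320, video_height=240, caption_height=60):
    """Return a minimal SMIL 1.0-style document as a string.

    The video sits in one region and the caption stream in a second
    region directly below it. All names and sizes are illustrative.
    """
    smil = ET.Element("smil")
    head = ET.SubElement(smil, "head")
    layout = ET.SubElement(head, "layout")

    # Overall window: video on top, caption strip underneath.
    ET.SubElement(layout, "root-layout",
                  width=str(width), height=str(video_height + caption_height))
    ET.SubElement(layout, "region", id="video_region",
                  top="0", left="0",
                  width=str(width), height=str(video_height))
    ET.SubElement(layout, "region", id="caption_region",
                  top=str(video_height), left="0",
                  width=str(width), height=str(caption_height))

    # <par> plays its children in parallel: the video and its captions.
    body = ET.SubElement(smil, "body")
    par = ET.SubElement(body, "par")
    ET.SubElement(par, "video", src=video_src, region="video_region")
    ET.SubElement(par, "textstream", src=captions_src, region="caption_region")

    return ET.tostring(smil, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical filenames; the caption file itself is still a separate file.
    print(build_smil("clip.rm", "captions.rt"))
```

Note that even in this toy version the captions live in their own file. The SMIL document only says where they go and that they run alongside the video.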

All these players, after years of development, flub even rudimentary text-handling primitives like alignment. You can’t even right-align text reliably in all three players. You’ve been able to do that on TV in North America since 1985.

You can set various type parameters, like colour, font, and size, but since we’re almost always dealing with type 14 or fewer pixels high, results are not great even when you try to use oddball fonts.

It ain’t pretty

So what does the state of the art look like? Behold the widest collection of available samples. Don’t say we don’t work for yez here at NUblog. And, yeah, we’ll include subtitles here – because we’re feeling generous, and because the same technologies are used with the same typographic atrocities.

QuickTime (“Suction Cups” [.mov])

Small picture
Big words
Still not enough pixels

QuickTime (Introduction to the Screen Reader)

Unwise Geneva Bold type choice: Counters fill in, character shapes distort
Good effort at speaker identification undone by italic pixelation
As generally seen in online captioning, improper linebreaks (note full-measure first line and orphaned single word on second line)

QuickTime (UVM)

Sensibly chunky Charcoal font
But too much text, and with poor copy-editing
Too-small margins

QuickTime (“Volt: The Conversation/The Chase” English subtitles [fan page])

Marathon line lengths
Spindly Geneva type
Bad copy-editing
Somewhat inadequate translation

QuickTime (“Mango Blue” [.mov]; larger view):

Bilingual options, but all-roman, chiefly-lowercase Geneva typography
Excellent interface design
Chiaroscuro video

QuickTime (DeafPlanet; very long larger version):

Amusing, kooky kidz font (apparently Cocon) that remains adequately legible
Yellow foreground colour works well
Good positioning (if they can do it, so can you)
Copy-editing typical of deaf English
Single line is a limitation (some utterances here call for two lines)
Good effort at alignment for speaker identification (falls down at shot change illustrated here)
Poor use of displacement for repeated utterances
Caption on/off controls almost never work

Flash (“Zoot Suit Riot”; larger view)

Great potential, poor execution
Selectable text that can’t stay in a single font from line to line
Poor contrast

RealPlayer (Bunnyfoot):

Solid effort at typographic speaker identification
Bad copy-editing, and unsupportable rewriting of original copy to serve corporate “branding” interests: “One dot-com company from Oxfordshire is convinced we’re accelerating towards an Internet car future. And as Sarah Campbell found out, they’ve got the wheels to prove it” becomes “Bunnyfoot is convinced ¶ we’re heading towards ¶ an Internet car future as ¶ Sarah Campbell found out”
Hideous caption chunking
Illustration shows composite of representative consecutive frames (very long larger version)

RealPlayer (CBC News decoded Line 21 captions):

Slightly worse than Line 21 captions seen on a TV
Burned in by caption decoder upstream of encoding station
Can be added to any player format

RealPlayer (EEOC Spanish subtitles):

Accents are shoehorned into poorly-designed screenfonts, impairing copyfitting and legibility
No speaker identification

Anything we’re missing?

Undetectable in these still images are two factors crucial to comprehension:

Blink rate: Number of blank frames between continuous captions. Can be as low as one under battle conditions. Two is the correct number; never more than that.
Chunking: Division of full sentences into discrete blocks must take place at boundaries of sense: Each chunk has to make sense in and of itself. Sometimes impossible in long noun phrases.
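Blink rate, at least, is easy to check mechanically. The sketch below counts the blank frames between consecutive captions and flags any gap outside the one-to-two-frame range described above. The Caption structure and function names are our own invention for illustration, and the code assumes every caption in the list is meant to follow the previous one continuously (a deliberate pause in dialogue would need to be excluded first).

```python
from dataclasses import dataclass


@dataclass
class Caption:
    start_frame: int  # first frame on which the caption is visible
    end_frame: int    # last frame on which the caption is visible
    text: str


def blank_frames(captions):
    """Count the blank frames between each consecutive pair of captions."""
    return [
        nxt.start_frame - cur.end_frame - 1
        for cur, nxt in zip(captions, captions[1:])
    ]


def blink_problems(captions, ideal=2, minimum=1):
    """Return (index, gap) pairs for gaps below `minimum` or above `ideal`.

    `index` is the position of the caption that precedes the gap.
    """
    return [
        (index, gap)
        for index, gap in enumerate(blank_frames(captions))
        if gap < minimum or gap > ideal
    ]


if __name__ == "__main__":
    captions = [
        Caption(0, 59, "One dot-com company"),
        Caption(62, 120, "from Oxfordshire is convinced"),   # 2 blank frames: fine
        Caption(121, 180, "we're accelerating towards"),     # 0 blank frames: no blink
        Caption(186, 200, "an Internet car future."),        # 5 blank frames: too many
    ]
    for index, gap in blink_problems(captions):
        print(f"After caption {index}: {gap} blank frame(s)")
```

Chunking is another matter. Deciding where a sentence breaks at a boundary of sense is a job for a human editor, not a frame counter.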

These issues are generally flubbed in online captioning, and often flubbed in other media.

But we don’t want to write a how-to manual or anything.
