Now, if you will permit us to giddyup one of our hobbyhorses for a moment, we would like to place something in the record about the general hideousness of online captioning typography.
“Wh-wh-wha’?” you blubber, doing a triple-take. Yes, dear friends, while online video may be its own punchline (it’s still too small and jumpy five years into the game), you still need to make it accessible. Everyone pretty much assumes that means accessible to deaf and hard-of-hearing viewers, and we’ll humour those people for the time being. (You think blind people aren’t interested in online video? And isn’t online captioning most useful for people stuck in offices with crappy speakerless Windows clones – or trapped in an office where you don’t want the boss to hear the video you’re watching?)
Every available captioning method is crap. Indeed, every available method fails to be better than reality, if we may quote our favourite usability “super-expert.”
What is the reality, then? Crappy Line 21 TV closed captions and (generally) crappy DVD bitmaps. (World System Teletext captions in PAL-format countries are also pretty awful, but we don’t have adequate pictures to show you. There pretty much are no adequate pictures, an admission of the deplorable appearance of WST captions. Nobody wants the truth to be widely displayed.)
Line 21 captions
- Coarse dot matrix
- Often no descenders
- Almost invariably white on black
- Still usually all-uppercase
DVD subtitles
- Bitmaps (“subpictures”) with four colours and four transparency values
- Generally poor typeface selection by ignorami and non-experts using the antitypographic Windows platform
- Almost no use of custom-engineered screenfonts
- Background masks almost never used
- Usable for foreign-language subtitles (shown here) or same-language captions
Care to compare both at once? On Region 1 DVDs, you can.
Line 21 captions vs. DVD subtitles
- In DVD bitmaps, note tendency to adopt subtitling-like all-centred placement (skimpy and inadequate for deaf viewers) mated with Line 21–legacy typographic accommodations (like spaces inside brackets)
- Line 21 captions are uglier but always more readable despite bitmaps’ enormous available palette
Still no better than reality?
So how are things different online?
Well, start with the fact that the three dominant online video players – QuickTime, Real, and Windows Media Player (WiMP) – all theoretically “support” captioning, but the exact meaning of “support” is critical here. All three players can display captions. The problem is getting the captions into the players.
- First, captions themselves are usually saved as text files. (You can use plain text. RealPlayer also understands its own proprietary format, RealText, while QuickTime also supports the proprietary QTtext format.)
- A second file controls presentation and timing of captions. Both QuickTime and RealPlayer let you use SMIL (Synchronized Multimedia Integration Language), a markup language similar to HTML, to synchronize captions to online video.
- Windows Media Player is different. It uses a proprietary Microsoft format called SAMI (Synchronized Accessible Media Interchange) that combines caption text and presentation instructions in the same single file. But WiMP does not support all of SAMI even though both were invented by Microsoft!
- You can see, then, that there isn’t a single presentation file format that can be used in every player. SMIL is arguably the closest thing to a universal format, given that it functions with two players that themselves work on multiple computer platforms.
- While there are quite a few programs that can produce or export SMIL files (which, like HTML, are simple text files you can manually type out yourself in any text editor if you have to), the fact remains that you have to create two files – the captions themselves and a special file to synchronize captions.
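For the curious, here is roughly what that second file amounts to. This is a minimal sketch in Python that writes a bare-bones SMIL 1.0 document playing a video and a caption text stream in parallel, with the captions in a strip below the picture. The function name `build_smil`, the filenames, and the pixel dimensions are our own assumptions for illustration, and each player layers its own extensions and quirks on top of plain SMIL, so treat this as the general shape rather than something guaranteed to play everywhere.

```python
import xml.etree.ElementTree as ET


def build_smil(video_src: str, caption_src: str,
               width: int = 320, video_height: int = 240,
               caption_height: int = 60) -> str:
    """Return a minimal SMIL 1.0 document as a string.

    All names and sizes here are hypothetical defaults, not values
    required by any particular player.
    """
    smil = ET.Element("smil")

    # <head><layout>: one region for the picture, one for the captions.
    head = ET.SubElement(smil, "head")
    layout = ET.SubElement(head, "layout")
    ET.SubElement(layout, "root-layout",
                  width=str(width),
                  height=str(video_height + caption_height))
    ET.SubElement(layout, "region", id="video",
                  top="0", left="0",
                  width=str(width), height=str(video_height))
    ET.SubElement(layout, "region", id="captions",
                  top=str(video_height), left="0",
                  width=str(width), height=str(caption_height))

    # <body><par>: children of <par> play at the same time.
    body = ET.SubElement(smil, "body")
    par = ET.SubElement(body, "par")
    ET.SubElement(par, "video", src=video_src, region="video")
    ET.SubElement(par, "textstream", src=caption_src, region="captions")

    return ET.tostring(smil, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical filenames: the caption file is the *other* file you
    # still have to produce separately.
    print(build_smil("clip.rm", "captions.rt"))
```

Note what the sketch does not do: it never touches the caption text or its timing. That lives in the separate caption file, which is exactly the two-file burden described above.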
All these players, after years of development, flub even rudimentary text-handling primitives like alignment. You can’t even right-align text reliably in all three players. You’ve been able to do that on TV in North America since 1985.
You can set various type parameters, like colour, font, and size, but since we’re almost always dealing with type 14 or fewer pixels high, results are not great even when you try to use oddball fonts.
It ain’t pretty
So what does the state of the art look like? Behold the widest collection of available samples. Don’t say we don’t work for yez here at NUblog. And, yeah, we’ll include subtitles here – because we’re feeling generous, and because the same technologies are used with the same typographic atrocities.
QuickTime (“Suction Cups” [.mov])
- Small picture
- Big words
- Still not enough pixels
QuickTime (Introduction to the Screen Reader)
- Unwise Geneva Bold type choice: Counters fill in, character shapes distort
- Good effort at speaker identification undone by italic pixelation
- As generally seen in online captioning, improper linebreaks (note full-measure first line and orphaned single word on second line)
- Sensibly chunky Charcoal font
- But too much text, and with poor copy-editing
- Too-small margins
QuickTime (“Volt: The Conversation/The Chase” English subtitles [fan page])
- Marathon line lengths
- Spindly Geneva type
- Bad copy-editing
- Somewhat inadequate translation
QuickTime (“Mango Blue” [.mov]; larger view):
- Bilingual options, but all-roman, chiefly-lowercase Geneva typography
- Excellent interface design
- Chiaroscuro video
QuickTime (DeafPlanet; very long larger version):
- Amusing, kooky kidz font (apparently Cocon) that remains adequately legible
- Yellow foreground colour works well
- Good positioning (if they can do it, so can you)
- Copy-editing typical of deaf English
- Single line is a limitation (some utterances here call for two lines)
- Good effort at alignment for speaker identification (falls down at shot change illustrated here)
- Poor use of displacement for repeated utterances
- Caption on/off controls almost never work
Flash (“Zoot Suit Riot”; larger view)
- Great potential, poor execution
- Selectable text that can’t stay in a single font from line to line
- Poor contrast
- Solid effort at typographic speaker identification
- Bad copy-editing, and unsupportable rewriting of original copy to serve corporate “branding” interests: “One dot-com company from Oxfordshire is convinced we’re accelerating towards an Internet car future. And as Sarah Campbell found out, they’ve got the wheels to prove it” becomes “Bunnyfoot is convinced ¶ we’re heading towards ¶ an Internet car future as ¶ Sarah Campbell found out”
- Hideous caption chunking
- Illustration shows composite of representative consecutive frames (very long larger version)
RealPlayer (CBC News decoded Line 21 captions):
- Slightly worse than Line 21 captions seen on a TV
- Burned in by caption decoder upstream of encoding station
- Can be added to any player format
RealPlayer (EEOC Spanish subtitles):
- Accents are shoehorned into poorly-designed screenfonts, impairing copyfitting and legibility
- No speaker identification
Anything we’re missing?
Undetectable in these still images are two factors crucial to comprehension:
- Blink rate: Number of blank frames between continuous captions. Can be as low as one under battle conditions. Two is the correct number, but nothing above that.
- Chunking: Division of full sentences into discrete blocks must take place at boundaries of sense: Each chunk has to make sense in and of itself. Sometimes impossible in long noun phrases.
These issues are generally flubbed in online captioning, and often flubbed in other media.
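Blink rate, at least, is easy to check mechanically. The sketch below, in Python, counts the blank frames between consecutive captions and flags any gap outside the one-to-two-frame range described above. The `Caption` structure, its frame-number fields, and the function names are assumptions of ours rather than any caption format's real data model, and the check presumes you hand it a run of captions that are meant to be continuous (a deliberate pause in dialogue would be flagged too).

```python
from dataclasses import dataclass


@dataclass
class Caption:
    first_frame: int  # first video frame on which the caption is visible
    last_frame: int   # last video frame on which the caption is visible
    text: str


def blank_frames(prev: Caption, nxt: Caption) -> int:
    """Number of caption-free frames between two consecutive captions."""
    return nxt.first_frame - prev.last_frame - 1


def check_blink_rate(captions: list[Caption]) -> list[tuple[int, int]]:
    """Return (index, gap) for every caption whose gap from the previous
    caption is not 1 or 2 blank frames.

    Assumes `captions` is sorted by time and represents one continuous
    run of captioning.
    """
    problems = []
    for i in range(1, len(captions)):
        gap = blank_frames(captions[i - 1], captions[i])
        if gap < 1 or gap > 2:
            problems.append((i, gap))
    return problems


if __name__ == "__main__":
    run = [
        Caption(0, 59, "One dot-com company"),
        Caption(62, 120, "from Oxfordshire"),   # 2 blank frames: fine
        Caption(121, 180, "is convinced"),      # 0 blank frames: no blink
        Caption(186, 240, "we're accelerating"),  # 5 blank frames: too many
    ]
    for index, gap in check_blink_rate(run):
        print(f"caption {index}: {gap} blank frame(s) before it")
```

Chunking is another matter: deciding where the boundaries of sense fall takes a human reader, and we know of no comparably simple check for it.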
But we don’t want to write a how-to manual or anything.
Posted on 2002-05-20