‘I can’t read what you’re saying’

Presentation made at the 37th annual Association Typographique Internationale (ATypI) conference, Vancouver, 2003.09.27

by Joe Clark

Understanding captioning and subtitling


Captioning: Rendering of speech and other audible information in the written language of the audio
Subtitling: Rendering a translation of dialogue and certain onscreen elements in visible words


  1. Captions are intended for deaf and hard-of-hearing audiences. The assumed audience for subtitling is hearing people who do not understand the language of dialogue.
  2. Captions move to denote who is speaking; subtitles are almost always set at bottom centre.
  3. Captions can explicitly state the speaker’s name.
  4. Captions notate sound effects and other dramatically significant audio (non-speech information). Subtitles assume you can hear the phone ringing, the footsteps outside the door, or thunder rumbling.
  5. Captions render tone and manner of voice where necessary.
  6. Subtitles are usually open: You always see them. Captions are usually closed; you have to turn them on or opt into them.
  7. Captions are usually in the same language as the audio. Subtitles are usually a translation.
  8. Subtitles also translate onscreen type, like a sign tacked to a door, a computer monitor display, a newspaper headline, or opening or closing credits.
  9. Subtitles do not mention the source language. A film with dialogue in multiple languages will feature continuous subtitles that never indicate that the source language has changed.
  10. Captions tend to render the language of dialogue, transliterate the dialogue, or tell you what language is being spoken.
  11. Captions ideally render all utterances.
  12. A subtitled program can be captioned (subtitles first, captions later). Captioned programs aren’t subtitled after captioning.

State of the art

Captioning typography is tied to captioning technology.

Line 21
  1. Captioning in NTSC countries (Canada, U.S., Japan)
  2. The spec is called EIA-608
  3. “Line 21” comes from the line of the television picture where captions are transmitted
  4. Three generations of Line 21 fonts, all of them terrible
  5. Original fonts had no descenders. That’s why captioning to this day is usually in uppercase, which was deemed less illegible
    • I don’t need to list the many ways in which this is undesirable
  6. Black backgrounds. White foreground by default. Red, blue, green, yellow, cyan, and magenta possible foreground colours, which nobody uses
  7. You get italics and underlining – and blinking. Turning those on or off creates a visible space. Same with changing colours
  8. Full-screen addressing; 32-column rows
  9. Monospaced fonts in every case I’ve ever seen
  10. Decoders are built into most TV sets by U.S. law; 25 million are sold each year
  11. Decoder maker has choice of font; one font per set
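
The cost of style changes in a 32-column row can be sketched in a few lines. This is only an illustration under stated assumptions, not a decoder implementation: the function names and the segment model are hypothetical, and the sketch assumes every change of style between adjacent segments takes up exactly one blank cell.

```python
ROW_WIDTH = 32  # columns per Line 21 row, as listed above


def cells_needed(segments):
    """Count the cells a row needs for a list of (style, text) segments.

    Assumption: each change of style between adjacent segments occupies
    one cell that is displayed as a blank space.
    """
    total = 0
    previous_style = None
    for index, (style, text) in enumerate(segments):
        if index > 0 and style != previous_style:
            total += 1  # the style change shows up as a visible space
        total += len(text)
        previous_style = style
    return total


def fits_row(segments, width=ROW_WIDTH):
    """Return True if the segments fit in one row of the given width."""
    return cells_needed(segments) <= width


# An italic word between two roman phrases: 6 + 1 + 5 + 1 + 6 cells.
row = [("roman", "I said"), ("italic", "never"), ("roman", "again.")]
print(cells_needed(row), fits_row(row))
```

Between words the blank cell simply stands in for the word space. Inside a word it does not: italicizing only the middle of "unbelievable" would put two visible gaps in the word.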
World System Teletext
  1. PAL countries (Europe, Australia); SÉCAM in France has an equivalent system
  2. No italic or underlining
  3. Several colours
  4. Full-screen addressing; 36-column rows
  5. Monospaced
  6. Decoders built into many midrange and high-end sets
  7. Decoder maker has choice of font; one font per set; often the same invariant font from the dawn of teletext
  8. Some set-top boxes can decode teletext, as in Australia
DVB: Digital Video Broadcasting
  1. Used for digital terrestrial broadcasting in the U.K.
  2. Uses teletext files simply re-rendered in bitmap fonts
DVD
  1. Every DVD can contain up to 32 tracks of subpictures, which can be used for captions, subtitles, or anything else a bitmap can be used for
  2. Colours are horrible
    1. You get foreground, background, and two emphasis colours
    2. Each pixel gets to be one colour from a palette of 16, with up to 14 transparency levels
  3. Should theoretically permit antialiasing, but never does in practice
  4. The default spacing that authors use is terrible. Spacing is very important in screenfonts, as we learned from Verdana and Georgia. Letters are too close together
  5. NTSC discs can also carry Line 21 captions; teletext can never be carried
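
One way the four pixel roles could, in theory, carry antialiasing is to treat the two emphasis colours as intermediate steps between background and foreground. The sketch below is an assumption about how that might be done, not a description of any authoring tool; the role names and the function are hypothetical.

```python
# The four roles a subpicture pixel can take, per the list above.
ROLES = ("background", "emphasis_1", "emphasis_2", "foreground")


def quantize_coverage(coverage):
    """Map glyph coverage in [0.0, 1.0] to one of the four pixel roles.

    Assumption: the emphasis colours are set to shades between the
    background and foreground colours, so they can act as edge pixels.
    """
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be between 0.0 and 1.0")
    index = min(int(coverage * len(ROLES)), len(ROLES) - 1)
    return ROLES[index]


# One scanline across the edge of a letter stroke.
scanline = [0.0, 0.3, 0.6, 1.0]
print([quantize_coverage(c) for c in scanline])
```

Two intermediate steps are far coarser than the greyscale smoothing of a computer screenfont, but still better than a hard one-bit edge.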
HDTV
  1. U.S. and Canadian high-definition TV specification is called EIA-708
  2. Eight font families are specified:
    1. Proportionally spaced without serifs
    2. Monospaced without serifs
    3. Proportionally spaced with serifs
    4. Monospaced with serifs
    5. Casual font type
    6. Cursive font type
    7. Small capitals
    8. Default (undefined)
  3. An accident waiting to happen, since foundries are selling off-the-shelf fonts for print typography
  4. There are next to no native HDTV captions being produced; everything is “upconverted” from 608
Film
  1. Limited number of subtitling and captioning methods for first-run movies
    1. Burn off the emulsion, resulting in monoline or stroked fonts
    2. Optical reprint using an internegative
  2. Offline editing systems, such as Avid or Final Cut Pro, can now impose captions or subtitles
Rear Window captioning
  1. Closed-captioning system for first-run movies
  2. Large LED panel on the back wall of the theatre that displays captions in mirror-image
  3. On the way in, you pick up a Plexiglas reflector, sit it in your cupholder, and read the reflected captions while watching the movie
  4. There is also a competing system of projected bitmaps that I’ve never seen, but it sounds so terrible I barely want to think about it


  1. Nearly all the people who have specified fonts so far have been incompetent
  2. They’ve also refused to learn over the decades
  3. There are no suitable fonts in existence


Captioning is different from other forms of typography, even subtitling, as follows:

  1. Instant comprehension: You snooze, you lose. If you miss the caption, you never get to read it again. And since you are, presumably, deaf, you have no way of figuring out what they said. (In theory, today you can rewind the tape or disc and rewatch it, but nobody makes two sets of captions, one for instant comprehension and one for rewinding and rewatching.)
  2. Low resolution, usually monospaced. Even film subtitles aren’t high-resolution most of the time
  3. There’s no testing of captioning fonts.
  4. People creating and using captioning fonts have no taste.

The most important constraint

The nature of the fonts leads to errors and outright lies. If you can’t fit as many words into the caption as the person actually said, or if you can’t fit them in a way that’s readable, then you have to edit the caption. Editing leads to misrepresenting the source leads to lying to the audience – all because your fonts didn’t let you write down exactly what they said.
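
The fitting problem can be made concrete with a small check. This is a rough sketch only: the 32-column width is borrowed from the Line 21 figures earlier, while the three-row limit and the function name are assumptions chosen for illustration.

```python
import textwrap


def fit_verbatim(text, columns=32, max_rows=3):
    """Wrap verbatim dialogue into monospaced rows and report whether it fits.

    columns: cells per row (32 is the Line 21 figure).
    max_rows: rows allowed per caption (an assumed limit, not from a spec).
    Returns (rows, fits).
    """
    rows = textwrap.wrap(text, width=columns)
    return rows, len(rows) <= max_rows


rows, fits = fit_verbatim("I can't read what you're saying")
print(rows, fits)

# A long speech overflows the caption, which is where editing begins.
rows, fits = fit_verbatim("word " * 40)
print(len(rows), fits)
```

Whenever the check fails, the captioner has to split the caption, shorten its display time, or edit the words, and only the first two options leave the speaker's words intact.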

Existing fonts

We really don’t have any custom-made captioning and subtitling fonts. One or two have been used over the years.

Examples: Tiresias

  1. Old BBC slabserif: The Department of Typography at the University of Reading created a slabserif font for BBC subtitling (not captioning, subtitling; don’t get those confused) back in the 1980s
  2. Tiresias, everyone’s savior
    1. John Gill of RNIB and Bitstream
    2. Many variations for signage, only one for TV: Tiresias Screenfont
    3. Amateur design stewardship
      1. No italics. John Gill said in an interview: “USA subtitling requires italics and bold which are not used in European subtitling.”
      2. Ill-finished and actually confusable letterforms.
  3. Or people use homegrown fonts or “whatever’s handy,” including Arial.
  4. Typography is poor
    1. Windows-style neutral quotation marks and dashes
    2. Not much in the way of alignment or positioning
      1. Improper influence of subtitling on captioning. People seem to think it’s confusing to move captions around the screen to show who’s talking; in reality, it’s confusing not to

Online captioning

So things are pretty terrible in caption typography.

But now we have “multimedia.”

You ain’t seen nothin’ yet.


There is almost no online captioning. It barely exists.


There are two forms:

  1. Retransmitted TV captioning
  2. Player-specific captioning
Retransmitted TV captioning
  1. It’s just decoded TV closed captions
Player-specific captioning
  1. Three main players: QuickTime, RealPlayer, and Windows Media
  2. Some minority players exist
  3. You have to add captions as a separate file that is synchronized to the original movie
  4. Those separate files come in two formats, SMIL and SAMI
    • SMIL: Synchronized Multimedia Integration Language, a World Wide Web Consortium standard
    • SAMI: Synchronized Accessible Media Interchange
  5. It’s virtually impossible in practice. The software is terrible and virtually no one has expertise
  6. Fonts can be specified, but it depends on what’s installed on the viewer’s computer, of course
  7. Typically offscreen display
    1. Occasionally discussed in “real” captioning – I talked about it back in 1989 for HDTV – but it is in some ways antithetical to the entire practice of captioning
    2. You can add captions to a subtitled production, but they’re next to impossible to follow
    3. Essentially, positioning disappears. You can’t even reliably achieve left, centre, and right justification if you set up files for all three players
  8. Typically white-on-black
  9. System-level antialiasing is not known to apply to media players on OS X; ClearType should work by default in media players on Windows
  10. Italics are awful
  11. Blink rate (number of no-caption frames between frames with captions) is zero, making captions hard to read
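
To give a sense of what one of those separate caption files involves, here is a minimal sketch that writes a SAMI-style file from a list of timed cues. Treat it as an assumption-laden illustration: the function name `to_sami`, the `blink_ms` parameter, and the class name `ENUSCC` are hypothetical, the markup is pared down to the bare minimum, and real players may expect more than this.

```python
from html import escape


def to_sami(cues, blink_ms=0, font="Arial"):
    """Build a minimal SAMI-style document.

    cues: list of (start_ms, end_ms, text), sorted by start time.
    blink_ms: blank time to leave before the next caption appears
              (0 reproduces the no-blink behaviour described above).
    font: only takes effect if installed on the viewer's computer.
    """
    lines = [
        "<SAMI>",
        "<HEAD>",
        '<STYLE TYPE="text/css"><!--',
        f"P {{ font-family: {font}; color: white; background-color: black; }}",
        ".ENUSCC { Name: English; lang: en-US; }",
        "--></STYLE>",
        "</HEAD>",
        "<BODY>",
    ]
    for index, (start, end, text) in enumerate(cues):
        if index + 1 < len(cues):
            next_start = cues[index + 1][0]
            # Trim the caption so a blank gap precedes the next one.
            end = max(start, min(end, next_start - blink_ms))
        lines.append(f"<SYNC Start={start}><P Class=ENUSCC>{escape(text)}</P></SYNC>")
        # A non-breaking space clears the caption from the display.
        lines.append(f"<SYNC Start={end}><P Class=ENUSCC>&nbsp;</P></SYNC>")
    lines += ["</BODY>", "</SAMI>"]
    return "\n".join(lines)


cues = [(0, 2000, "Hello"), (2000, 4000, "World")]
print(to_sami(cues, blink_ms=100))
```

Note what the file cannot say: there is nothing here that reliably positions a caption onscreen, and the font request is only a request.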
  1. Virtually impossible to caption, but it’s improving
PDF
  1. You can embed multimedia, complete with captions, subtitles, dubbing, or audio descriptions, in Acrobat 6
  2. Controls there are my idea

What we need

  1. Less reliance on player-specific captioning. Retransmitted or burned-in captions are better
  2. Font smoothing on by default in players even if turned off elsewhere
  3. Way frigging better fonts
  4. Training

Posted: 2003.10.04