Joe Clark: Media access

Comments on U.K. “guidelines” on “subtitling”

See also: Comments on U.K. “guidelines” on audio description

Updated 2001.07.15

On this page

  • Background
  • Pros
  • Cons
  • The iffy parts


Background

The Independent Television Commission (ITC), a regulatory body overseeing certain television channels in the United Kingdom, has produced guidelines for captioning (the U.K. term is “subtitling”), sign language, and audio description on British television.

The guidelines are available on the Web, but in a difficult format: Microsoft Word documents.

  • The main page, at the ITC Engineering section, that provides links to the documents is here.
  • The Microsoft Word document providing guidelines for captioning is here.
  • Two other guideline documents are available, for audio description (read my comments on those guidelines) and sign language.

A note about words: For unknown reasons, our dear British friends insist on using the word “subtitling” to mean “captioning” (titles in the same language as the audio). A “caption,” to the British, is any other kind of onscreen textual graphic, like the written name of a news announcer.

A “subtitle,” in British vernacular, can also apply to a title in a translated language. It is thus impossible to distinguish between captions and subtitles in the British argot: They’re both “subtitles.” In British English, it becomes possible to subtitle a subtitled program, and also possible to subtitle a captioned program, in any of a hundred conceivable languages.

Clear as mud, isn’t it?

This terminology is objectively inferior to what we use in Canada and needs to die a quick death. I’m not all that interested in encouraging this kind of confusion, so “subtitle” in the U.K. documents will always be changed to “caption” in these comments.

For reference, the preferred terminology is:

  • caption: Title in the same language as the audio
  • subtitle: Title in a translated language different from the audio
  • super or key: Any other informational textual graphic. (The more artistic versions would simply be called titles. Titles explaining who created and worked on a program are credits.)

Research note: The Guidelines mention a couple of research studies:

  • “Switched On,” a general survey of captioning viewers in the U.K. (Full version apparently costs £20.)
  • “Dial 888: Subtitling for Deaf Children,” not available online


Pros

  • “Linebreaks within a word are especially disruptive to the reading process and should be avoided.”
    • Canadi-
      an twits have yet to learn this les-
      son, breaking words all over the place even though there’s more than enough space on the line to render them intact.
    • Do not add a hyphen to a word. Words that are always hyphenated can be broken at the hyphen, if necessary (left-handed), but a case could be made that a hyphenated proper name (Mary-Jane Haden-Guest) should not be broken at the hyphen.
    • In the exceptionally rare case that you must set a word that is wider than the measure (supercalifragilisticexpialidocious will not fit on a 32-character line), you have no choice but to hyphenate, but that will happen on one caption in five million.
  • The overlay technique:

    Subtle nuances of phrasing are difficult to deal with effectively, but special techniques can be used, for example:

    No... No... But that isn’t what I asked for.

    A more powerful effect is achieved by the “overlay” technique. This involves, for instance, turning the above example into two [captions], by first displaying No... and then adding the second part of the utterance after the pause and without deleting the No.... This dynamic method of simulating speech timing and phrasing can be very effective, but should be reserved for time and space emergencies because multiple overlays can result in jerky presentation and clogged screens.

    • It is now possible to use an overlay technique (let’s adopt that terminology) in Line 21 captioning. The overlay caption appears in the paint-on style. I’ve seen it three times, two of them in musicals, where one singer articulates a very long sentence (with a lot of held notes) and a second singer responds to individual words within the long phrase. Singer 1 gets a normal pop-on caption, while Singer 2’s lyrics are overlaid. Very nice.
    • Use where appropriate. That includes cases where two characters speak very close together, there isn’t enough time to show two separate captions, and the old way of rendering the utterances is to caption two blocks simultaneously onscreen. In this case, set the first block and overlay the second.
  • Music styles: Good advice here. Twit captioners don’t bother to tell you anything about the music playing in a scene. Canadians are particularly inept at this. Infuriatingly so, in fact. Our dear British confreres say:

    Provision of an occasional [caption] for mood music, if it is significant to the plot, can be very effective:


    Such [captions] should be used only sparingly. Occasionally, consecutive scenes are enacted in pitch darkness, and scene changes are signalled entirely by changes of incidental music. In such cases, if time permits, the [captioner] should use [captions] such as:


    Then, when the tempo of music changes dramatically, it is followed by:


    Thereby deaf viewers are made aware of the scene change.

  • Message to postproduction houses who think they are competent to caption so much as a used-car commercial: “Uncommon abbreviations, such as SFX, should be avoided.” We’re not writing a bloody shooting script here.
  • Typography: Apparently the Europeans have designed an entire font for digital television, Tiresias. It shares many characteristics with other engineered screenfonts, the details of which are extraneous here. The Guidelines state that “the Tiresias font shall be used for all [captions].” I suppose this is better than whatever 1980s bitmap holdover that World System Teletext currently uses. Some very sharp captioners might elect to use a different engineered screenfont and should be permitted to do so. What should be expressly forbidden is the use of any typeface ten years of age or older designed for print: No Univers, no Helvetica, and, for God’s sake, no accursed Arial or Times. (Faces like Thesis, Cæcilia, Serifa, and even Rotis could function adequately.)
  • On that same tip: “An italicised form of text may be used to indicate emphasis within a [caption].” Finally, the Europeans enter the 20th century and can use italics, which should automatically be used for all the purposes we use them over here: Every application in print typography (emphasis, titles of major artistic works, names of ships), in addition to offscreen voices, though there are many provisos there.
  • Music: In digital captioning, “The present practice of using # to indicate music shall be changed to use of two semiquavers as part of the Tiresias set.” Presently, song lyrics (and a few other applications) use the number sign or octothorpe, #. Once this recommendation comes into effect, we will have four different characters to surround song lyrics: The eighth note (“staffnote”) used in Line 21, the vertical bar | that represents it in captioning software, #, and “two semiquavers.” (What, by the way, is the difference? The Line 21 staffnote is an eighth note. A semiquaver is a sixteenth note. The Guidelines’ phrase “use of two semiquavers” presumably means “one at either end of the phrase” and not “two right up against each other at either end.”)
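
The linebreak rules at the top of this list are mechanical enough to sketch in code. What follows is a rough illustration in Python, not anybody's production captioning software. The function names, the `keep_whole` parameter (for hyphenated proper names you refuse to break), and the reading of “if necessary” as “the whole word will not fit on the rest of the line” are all my assumptions. The 32-character measure is the Line 21 figure mentioned above.

```python
def split_at_existing_hyphen(word, room):
    """Split `word` after the rightmost hyphen it already contains such that
    the first part fits in `room` characters. Returns (head, tail), or
    (None, word) if no such split exists. No hyphen is ever added here."""
    for i in range(len(word) - 2, 0, -1):
        if word[i] == "-" and i + 1 <= room:
            return word[: i + 1], word[i + 1 :]
    return None, word


def wrap_caption(text, measure=32, keep_whole=frozenset()):
    """Wrap `text` into lines of at most `measure` characters.

    Words are kept intact whenever they fit on a line by themselves.
    Words that already contain a hyphen may be broken at that hyphen,
    unless listed in `keep_whole` (hypothetical parameter). Only a word
    wider than the whole measure gets a forced, added hyphen.
    """
    if measure < 2:
        raise ValueError("measure must be at least 2 characters")
    lines, current = [], ""
    pending = text.split()[::-1]  # stack: next word is at the end
    while pending:
        word = pending.pop()
        room = measure - len(current) - (1 if current else 0)
        if len(word) <= room:
            current = f"{current} {word}" if current else word
            continue
        if word not in keep_whole:
            head, tail = split_at_existing_hyphen(word, room)
            if head:
                lines.append(f"{current} {head}" if current else head)
                current = ""
                pending.append(tail)
                continue
        if current:
            # Start a fresh line and try the same word again, intact.
            lines.append(current)
            current = ""
            pending.append(word)
            continue
        # The word alone is wider than the measure: the rare forced break.
        lines.append(word[: measure - 1] + "-")
        pending.append(word[measure - 1 :])
    if current:
        lines.append(current)
    return lines
```

Under these assumptions, `wrap_caption("supercalifragilisticexpialidocious")` is the only kind of input that ever gains a hyphen, and passing `keep_whole={"Mary-Jane"}` moves the whole name to the next line rather than splitting it.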


Cons

  • Chunking:
    • There’s ample linguistic research documenting the entirely obvious fact that sentences are composed of meaningful chunks. That, in fact, is the technical term: chunk. Clueless nitwit Canadian captioners have spent more than a decade ignoring the obvious, breaking captions after any word whatsoever, including the or I’d. Our British colleagues, to their credit, know better.

      [S]entences should be segmented at natural linguistic breaks such that each [caption] forms an integrated linguistic unit. Thus, segmentation at clause boundaries is to be preferred. For example:

      When I jumped on the bus...
      ...I saw the man who had taken the basket from the old lady.

    • However, the practice of ending captions in ellipses (...) must be stopped. It derives from old subtitling practice, and never made sense even there. It’s not as though we haven’t spent our entire lives reading lines of text that lead to further lines, pages that lead to further pages, screenfuls that lead to other screenfuls. If there’s no end punctuation, the sentence hasn’t ended. What kind of idiots do you take us for?
    • Further, ellipses add three precious characters to a very limited measure in which we can typeset caption text. Even if you believed ellipses were necessary, they aren’t necessary at the end of one caption and the beginning of the next. Frankly, it’s bad news all around. Stop using ellipses this way. And that applies to you, Captions, Inc., as much as the Brits.
    • Ellipses must retain their traditional use: To signify a pause or a deletion.
  • It gets worse. The Guidelines advise rewriting dialogue in an absurd, intrusive, condescending manner.

    It may be possible to break a long sentence into two or more separate sentences and to display them as consecutive [captions], e.g., “We have standing orders, and we have procedures which have been handed down to us over the centuries” becomes:

    We have standing orders
    and procedures.

    They have been handed down to us
    over the centuries.

    There seems to be a mad, irrational urge to avoid setting a caption that does not encompass an entire sentence. Get over it. Your job is not to rewrite dialogue; it is to render the dialogue.

  • “Difficult words”: Captioning for children is contentious and absolutely no one anywhere has it down pat. The Guidelines include a lengthy section on editing techniques for kids’ captioning. It is very difficult to make a convincing case for or (especially) against most of the recommendations, except for one:

    Difficult words should also be omitted rather than changed.

    • Dialogue: First thing we're going to do is make his big, ugly, bad-tempered head.
    • Simplified: First we're going to make his big, ugly head.

    As the Guidelines later mention, when the purpose of the segment is to introduce a new word, retain it and leave the caption onscreen longer to let kids read and understand it. But in the case above, “bad-tempered” carries more heft and spice than “big” and “ugly.” It’s the first thing you notice about the sentence, and not just because it’s a long word. (A shorter word like “shrivelled” would be just as prominent.) In this example, leave the caption verbatim, with longer reading time.

  • “Long speechless pauses in programmes can sometimes lead the viewer to wonder whether the teletext system has broken down. It can help in such cases to insert an explanatory caption such as:”

    No, no, no. You cannot caption silence, unless it is dramatically significant – e.g., Krusty the Klown tells a joke and nobody so much as coughs, let alone laughs. (You’d still rely on a caption in that case only if you could not see the audience.) What you’re dealing with in that example isn’t a pause but a transition in music, hence:

    [Theme music plays]
    [Music ends]
    [Romantic music plays]
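
The chunking complaint earlier in this section (no breaking after the or I’d) can be roughed out in code. This is a crude sketch in Python and only a stand-in for real linguistic segmentation: it fills each line greedily and then backs the break up so the line does not end on a “weak” word. The `WEAK_ENDINGS` list and both function names are hypothetical, invented here for illustration, and the sketch does not deal with single words wider than the measure.

```python
# Hypothetical, deliberately short list of words a caption line should not end on.
WEAK_ENDINGS = {
    "the", "a", "an", "of", "to", "and", "or", "that", "who",
    "i'd", "i'll", "i'm",
}


def is_weak(word):
    """True if `word`, ignoring case, curly apostrophes, and trailing
    punctuation, is in the WEAK_ENDINGS list."""
    return word.lower().replace("’", "'").strip(".,!?;:") in WEAK_ENDINGS


def chunk_lines(text, measure=32):
    """Split `text` into lines of at most `measure` characters, moving each
    break earlier when the line would otherwise end on a weak word."""
    words = text.split()
    lines = []
    start = 0
    while start < len(words):
        # Greedy fill: take as many words as fit in the measure.
        end = start + 1
        while end < len(words) and len(" ".join(words[start : end + 1])) <= measure:
            end += 1
        if end < len(words):
            # Back the break up past any weak words at the end of the line.
            best = end
            while best - 1 > start and is_weak(words[best - 1]):
                best -= 1
            # Keep the greedy break if backing up found nothing better.
            if not is_weak(words[best - 1]):
                end = best
        lines.append(" ".join(words[start:end]))
        start = end
    return lines
```

Fed the second half of the Guidelines’ own example, “I saw the man who had taken the basket from the old lady.”, a plain greedy fill at 32 characters would strand the at the end of the first line; this sketch breaks after taken instead. Notice that no ellipses are involved.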

The iffy parts

  • “[Caption] appearance should coincide with speech onset. [Caption] disappearance should coincide roughly with the end of the corresponding speech segment, since [captions] remaining too long on the screen are likely to be re-read by the viewer... another kind of ‘false alarm.’ [...] For hard-of-hearing people viewing programmes which consist mainly of monologue, research has shown that perfect synchronisation is not an absolute necessity and delays of up to six seconds do not affect information retention.”
    • In reality, when there’s a pause of a few seconds without dialogue, you can populate the screen with captions from the preceding dialogue without a problem, unless the pause is very dramatically significant, in which case you have to keep the screen clear.
    • On rare occasions, you can eat up part of the dialogue-free segment and then leave a captionless pause. In any event, there should be no dictum to force onset and removal to the exact onset and ending of dialogue. A bit too strict, that.
  • “When consecutive [captions] have boxes of similar size and shape and the second directly over-writes the first, it is useful to position them slightly differently on the screen. This makes it easier for the viewer to perceive that the [caption] has changed.”
    • Yes. Similar-looking captions must be either displaced slightly (moved up or down one line, or over by a tab stop or one or two characters) or rewritten to show different linebreaks.
    • From what little knowledge I have, by default there are no blank frames between captions in World System Teletext captioning. Over here, the correct number of blank frames between captions is two (not four, as is widely and improperly done in Canada, although one frame, as seen on The Practice, captioned by Vitac, can be tolerable).
  • Colours:
    • The Europeans have pretty much always had colour captions, and their practices are well-developed, though whether they’re sensible or not is another story. Only one passage raises flags: “A blue background with white text can also be useful to indicate a different quality of voice such as a robot or ghost.”
      • The use of punctuation to stand in for complex auditory phenomena is always a bad idea. The use of upper- and lowercase type, for example, in otherwise-all-capitals text as a means of indicating words articulated over a loudspeaker (the Caption Center’s old way) is one example. Using ... or --- to indicate bleeped obscenities is another.
      • Gallaudet research showed that deaf viewers preferred a literal explanation of vocal quality along with a typographic change. In this example, we use uppercase for normal speech and notating nonverbal information, but upper- and lowercase for whispering:

        I can't talk to you here.

        Accordingly, using some kind of oddball colour combination for “robots” or what-have-you is ill-advised. Write it out, as other recommendations in the Guidelines already hold.

      • The Guidelines list a whole raft of cases of punctuation standing in for complex auditory phenomena: Single quotation marks for voiceovers, double quotes for “mechanically-reproduced speech,” “text in brackets” for “whispered speech or asides.” Ignore all these recommendations. Explain what is happening in words.
    • The Guidelines discuss but don’t really explain “veiling” (in digital television). The concept seems to involve covering an entire section of the screen, from left edge to right, in a translucent background. (Or foreground.) The British seem to do this a lot in news and information programs. Veiling is in contrast with the background bounding box of captions, which follows the shape of the captions and does not extend beyond that shape. Veiling seems like a simplistic solution imposed by dilettantes who object to titling on their precious television screens: “Well, if I have to put up with those words and we have to have some kind of contrast, I don’t want that chunky, jagged zigzag of text moving around my screen.” The chunky, jagged zigzag of text actually obscures the least underlying visual detail and is, quite simply, better.
  • The Guidelines actually recommend using flashing (or blinking) captions under certain unusual circumstances. They’re technically possible under Line 21 also, but if you set so much as a comma in blink mode no one is going to read the rest of the captions: All eyes will focus on the blinking text and stay there. And among other things, it’s tacky.
  • “Although of value for live subtitling, the use of a word-by-word display can create problems for the reader because of the speed of speech output and possible confusion in eye-movement. Its advantage, however, is in the provision of near-verbatim text.” This is a bit rich. Reading stenocaptioned text is merely a question of experience. Very large captions with very short lines, as seem to be the norm in England, make things harder, but we get that over here, too, in sports games and on financial channels, when live scrolling captions are sequestered into a tiny corner. What is it about British caption viewers that makes them less able to handle the kind of text we watch every day with no trouble whatsoever?
  • Punctuation: Stop the insanity. “One means of enhancing the effectiveness of punctuation is by the use of a single space before exclamation marks and question marks.” This simply is not done in English – but don’t tell Captions, Inc. that, since they are the largest practitioners of this semiliterate perversion. In some house styles of hot-metal typography (old Penguin paperbacks, for example), ? and ! were preceded by a thin space, equivalent to 1/8 em (1/8 the point size). In later phototypesetting, this was regularized to one point. (The same rules applied to French, for example.) Typing a full word space before a question mark or exclamation point is not done in this language.
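
The blank-frame spacing discussed earlier in this section is simple arithmetic, so here is a small sketch of it in Python. Everything about the data model is my assumption: the `Caption` class, its field names, and the convention that `end` is the last frame on which the caption is visible. The function only enforces a minimum gap by trimming the earlier caption. It does not close up gaps that are wider than they need to be.

```python
from dataclasses import dataclass


@dataclass
class Caption:
    start: int  # first frame on which the caption is visible
    end: int    # last frame on which the caption is visible (inclusive)
    text: str


def enforce_blank_frames(captions, blank_frames=2):
    """Return new captions, sorted by start frame, in which each caption ends
    early enough to leave at least `blank_frames` empty frames before the
    next caption appears. Start frames are never moved."""
    ordered = sorted(captions, key=lambda c: c.start)
    fixed = []
    for current, following in zip(ordered, ordered[1:] + [None]):
        end = current.end
        if following is not None:
            # Frames strictly between the two captions: following.start - end - 1.
            latest_end = following.start - blank_frames - 1
            end = min(end, latest_end)
        if end < current.start:
            raise ValueError(f"no room to display caption: {current.text!r}")
        fixed.append(Caption(current.start, end, current.text))
    return fixed
```

With two captions butted up against each other at frames 0 to 60 and 61 to 120, the first is trimmed to end on frame 58, leaving frames 59 and 60 blank.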