Don’t show printouts to grannies and call that a test

Speaking notes from a presentation given 2007.09.12 at TypeTech, ATypI Brighton 2007.

Notes

I was at ATypI 2003 in Vancouver talking about caption fonts. Who can forget South Park?
I now run a research project, the Open & Closed Project, that is in the throes of birthing. We’re already working on screenfonts for captioning and subtitling.
- I’ve been having a devil of a time attracting designers. The entire history of typeface design has been about prospective work: typefaces designed because the designer wants to design them and maybe sell them later. Only in recent years has it been possible to make a living off commissioned fonts.
- Here we’ve got an application where typefaces go for tens of thousands of dollars and would be read every day by thousands or millions of people, yet almost nobody wants to work prospectively on them.
I’ve been working on screenfonts for captioning off and on since 1989, when I gave a presentation at a conference on the typographic requirements for captioning for HDTV.

Why is the function of caption and subtitle fonts important?

The computer has made us understand that typography is functional. We came to understand that by looking at lousy screenfonts.
- We saw right through IBM VGA fonts in WordPerfect. We barely even noticed they were fonts!
- The IBM CGA fonts were terrible by comparison. We noticed that, at least.
- But we didn’t begin to care until the Mac came along with its aliased bitmap fonts.
We now know there are fields of typography where function is important, like signage and – still – on computer screens.
- Signage is unique in that it could be considered to have a life-and-death function.
- But computer screens are still mostly used for work. People put a lot of effort into making the type on office computer screens better.
So what about recreational reading?
- Books and magazines are mostly legible. So are newspapers. That’s a generalization, but you know it’s true most of the time.
- Where do people do a lot of recreational reading that isn’t in books, magazines, or newspapers? Surprise: On TV, on home video, and at the movies.

Today we’re going to look at just how pathetic the research into screenfonts for captioning and subtitling is, and come up with some principles for new research into those fonts.

Now we have to talk about the distinction between captioning and subtitling

First, let’s define our terms.

Captioning is the transcription of dialogue and sound effects for deaf viewers.
Subtitling is the translation of some dialogue and some onscreen type for hearing viewers who do not understand the main language.

(Further comparison.)

We all speak English here, and many of us grew up speaking it, but that doesn’t mean we all speak the same dialect. The same word doesn’t always mean the same thing to different people.

Sometimes we have more than one word for the same concept, like lift and elevator, or trunk of a car and boot of a car. That’s one case: Two different words for the same meaning.

Or we might use one word with two meanings. A good example is pants: Does it mean trousers or underwear? You can explain which meaning you intend by using a different word. Say the word pants, explain with the word trousers. Or explain with the word underwear.

But that is not what’s going on with the words captioning and subtitling. In U.K. and Irish English, subtitling simultaneously means captioning and subtitling. It isn’t like pants meaning trousers and underwear; that’s three words for two concepts. It’s subtitling meaning subtitling or captioning, one word for two concepts.

This is really important. Subtitling and captioning are two different things, and you can have both at once.

Subtitle at bottom, caption at top right

But in Irish and British English, you cannot differentiate the two things. This, my friends, is not a dialect difference but simply a mistake.

Captioning is captioning and subtitling is subtitling.

Reading off a screen

Americans and even Canadians almost never see titles on screens unless they’re deaf. People in many other countries read titles frequently, but they’re almost all subtitles.
If you’re deaf or if you live in a subtitling country, you read thousands of words of captioning or subtitling or both every night.

If you watch TV two hours a night five nights a week for a year, you’ll probably read 4.4 million words just off a screen in that year.

Jensema (1996) surveyed reading speeds of many television genres and display forms (scrollup vs. pop-on). Working solely from his reported averages, in words per minute (wpm) and per hour (wph):

Scrollup: 151 wpm = 9,060 wph
Pop-on: 138 wpm = 8,280 wph

Assume 25% of your viewing is scrollup, 75% pop-on. Two hours a night, then, equals:

0.25 × 2 × 9,060 + 0.75 × 2 × 8,280 = 16,950 words/night

Over 5 nights a week for 52 weeks, you’re reading 4,407,000 words a year.

If we assume all scrollup captioning, the figure is 4,711,200. With all pop-on captioning, it’s 4,305,600. So the order of magnitude is the same.

Even if you do the same calculation with the very slowest captioning reported by Jensema (74 wpm), simply an impossible scenario even for young children abandoned in front of the TV by their moms, the order of magnitude is still comparable – 2.3 million. (With the maximum, 231 wpm, it’s 7.2 million.)
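
If you want to check the arithmetic or plug in your own viewing habits, here is a minimal sketch in Python. The function name and parameter names are mine, invented for illustration; the only data are Jensema’s reported speeds and the assumptions stated above (two hours a night, five nights a week, 52 weeks, a 25/75 split between scrollup and pop-on).

```python
# Average caption speeds reported by Jensema (1996), in words per minute.
SCROLLUP_WPM = 151
POPON_WPM = 138


def words_per_year(wpm_scrollup, wpm_popon, scrollup_share=0.25,
                   hours_per_night=2, nights_per_week=5, weeks_per_year=52):
    """Estimate words read per year from a mix of scrollup and pop-on captions.

    scrollup_share is the fraction of viewing time spent on scrollup
    captions; the remainder is assumed to be pop-on.
    """
    popon_share = 1 - scrollup_share
    # Blend the two speeds by viewing share, then convert minutes to hours.
    words_per_hour = (scrollup_share * wpm_scrollup + popon_share * wpm_popon) * 60
    return round(words_per_hour * hours_per_night * nights_per_week * weeks_per_year)


print(words_per_year(SCROLLUP_WPM, POPON_WPM))                      # 4407000
print(words_per_year(SCROLLUP_WPM, POPON_WPM, scrollup_share=1.0))  # 4711200
print(words_per_year(SCROLLUP_WPM, POPON_WPM, scrollup_share=0.0))  # 4305600
print(words_per_year(74, 74))    # slowest reported speed: 2308800
print(words_per_year(231, 231))  # fastest reported speed: 7207200
```

Change the share or the hours however you like; with any of the speeds Jensema reported, you stay in the millions.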

But you’re doing all that reading for pleasure, so reading all those words has to not hurt.

Let’s walk through some of the technologies that are used to display captions and subtitles

(See my hundreds of caption and subtitle photos up on Flickr.)

Line 21

Line 21 is the captioning system in use in Canada, the U.S., and a few other countries that use the NTSC television format.
It’s called Line 21 because that’s the line of the TV picture where the caption codes are transmitted.
Your screen has 15 lines of 32 characters, nominally monospaced characters. You can use colour, but, for historical reasons, not many people do. We’ve got italics, which are rather interesting for reasons I could tell you about over coffee.
Line 21 captions can be recorded on anything – except MPEG – that can record a TV signal. We’ve had captioned home videos and laserdiscs for decades. DVDs in the NTSC format can include Line 21 captions.
The font is up to the decoder. Decoders are built into TV sets in the U.S. by law; in Canada we get the same TV sets.

Here are three of my old external Line 21 decoders. (They’ve all been superseded by decoder chips built into TV sets.)

Three machines in a stack: Small chrome, large chrome with red LED display, very large half-chrome/half-woodgrain with rotary dials

I was planning on taking some custom photos for this presentation, and I thought: What would be the most ridiculous possible thing to photograph?

Well, Sailor Moon, of course. But I ended up with some unusable photographs.

Then, to show you what things were like with the original decoder, after a lot of cursing and frustration I managed to get it hooked up and I got it decoding actual 21st-century TV programs.

Screenshot shows italic mixed-case text with no descenders and a coarse dot matrix (hard-to-read a and y)

Teletext

Teletext is the captioning system used on analogue television in PAL-format countries, like the U.K. and Ireland and Australia.
The signals are distributed among several television lines. And teletext is used for other purposes, like full pages of information. You have to tune in to an actual channel of teletext to see captions.
You’ve got colour, but you don’t have italics.
Decoders are not built into TV sets by law, but most midrange and high-end TV sets have them. You cannot record teletext captions on a normal VHS tape, or on most other media, for that matter.

DVD

DVD has subpictures, which are bitmaps you can use for any purpose. You can tag certain streams of bitmaps as being captions or subtitles, which according to the spec and in reality are two different things.
PAL DVDs have to use subpictures for captioning. You can’t encode teletext captions on a DVD.
Fonts are really awful (a) because people don’t know what they’re doing and they use some total piece of shit like Arial and (b) because the pixels are not square or round. They’re all rectangular, but there are four different aspect ratios. It’s pretty hard to draw a curve with a rectangle.
By the way, if you’re wondering about HD-DVD or Blu-ray discs, don’t bother. They have some theoretically interesting caption and subtitle features, but neither format has really taken off.

HDTV

High-definition television in Canada and the U.S. uses the ATSC standard.
And under that spec you’ve got eight font categories –
1. “Default (undefined),” hence not really a font category
2. Monospaced serif (e.g., Courier, which is an unsuitable font for captioning)
3. Proportional serif (e.g., Times New Roman, also unsuitable)
4. Monospaced sansserif (e.g., Helvetica Monospaced [sic])
5. Proportional sansserif (e.g., Arial or Swiss [again sic])
6. Casual (“similar to Dom and Impress”)
7. Cursive (“similar to Coronet and Marigold”)
8. Small capitals (“similar to Engravers Gothic”), which is again not really a font category
The viewer can pick the font, the colour, the background colour, and other features.
The ATSC standard does not use Unicode and does not give you a wider canvas to work with. It’s really all about backward compatibility.

DVB

DVB is a digital television standard in the U.K. It uses bitmaps for captioning. We’ll be talking all about those in a while.

Online captions

Online captions can be closed captions. This is pretty stupid, because it replicates the broadcasting model. You don’t have only one channel online that has to serve everybody, deaf and hearing; you can upload as many video files as you want. Still, you can produce closed captions for all the main video formats, including Flash. They use the fonts on your computer.

Jeffrey Zeldman in a video image with caption below

Movies

There are lots of incompatible systems for captioning movies. The one in use in England is by DTS, and it involves projected open captions. They got the contract by flooding the country with “test sites” for free. The only movie I saw with DTS open captions used Arial.
The system most commonly used in Canada and the U.S. is Rear Window, which uses a big LED display at the back of the theatre. The display shows caption text in mirror image. You attach a translucent reflector to your seat and read the reflected captions. For several years there, I had seen more Rear Window–captioned movies than anyone else.

So those are the technologies we’re dealing with. And, typographically, they’re all crap!

Some general typography features:

Captions in all caps in the U.S. and Canada – still. The original caption decoders had no descenders on lower-case letters; all upper case was deemed less illegible. Though there is no reason to caption in upper case, it is still done.
No real quotation marks or dashes unless you use the extended character set. There aren’t any of those at all in teletext.
DVD subpictures are rather crap because you really cannot use antialiasing. You’ve got between one and four colours to work with, depending on how you read the spec, and that’s not enough.

The state of the art is a disaster

Let’s come back to the U.K. and talk about DVB, the standard for digital broadcasting.

The claimed solution for all captioning problems in the U.K. and Ireland is a typeface called Tiresias Screenfont.
It was designed in 1998 by John Gill of the RNIB, the Royal National Institute for the Blind, and by two alleged type designers, Chris Sharville and Pete O’Donnell. But mostly it was designed by Bitstream when you get right down to it.
- It’s one weight only, with no italics.
- And if you want a worldwide licence, you need to pony up $17,500.
John Gill specifically told me that a goal of the project was “legibility, not beauty.” Now, nobody here is gonna say you always get one with the other. But nobody is going to confirm what he’s insinuating – that you either get one or the other but not both.
Even though DVB captions are supposedly just bitmaps and you can put whatever you want in them and just transmit the bitmaps to the home viewer, Tiresias is built into set-top boxes. That means viewers are condemned to read this typeface, which, I emphasize, does not even have an italic.
Now, how did this all happen? Because of “research.” It consists of two papers, neither of them peer-reviewed and only one of them still online.
The other one isn’t online anymore, and its online version never had any images, even after I reported that problem.
The printed version appeared in an obscure conference proceedings from Finland. For over two years, I’ve been asking various people, including John Gill, for a readable PDF with images or a hardcopy of the paper, and I’ve never gotten one. They’re intentionally concealing the research.
The research was an attempt to test Tiresias Screenfont against the existing shitty bitmap font from teletext and against other fonts.
The whole thing’s a disaster and it’s junk science from start to finish. It’s a textbook case of how not to test the performance of a typeface.
John Gill is a scientist at the Royal National Institute for the Blind and is really only interested in visually-impaired people even though captioning is for deaf people.
He tested 35 visually-impaired people and 48 hearing-impaired people. There were also 14 “controls,” who were mostly interpreters.
The average age for the visually-impaired group was 60.
- And the average age for the deaf group was 62, “although this average is skewed by five people under 21.”
- Both groups had test subjects over 90 years of age.
For the visually-impaired subjects, he printed out a sentence in 14-point type and showed it to the subjects on a closed-circuit monitor in three fonts – the teletext font, Tiresias, and Times.
- Yes, he tested screenfonts for captioning by showing a printout on a TV monitor.
- And yes, he tested Tiresias against Times and a font made out of a 7×5-dot bitmap. This is hardly a realistic test. Any text typeface you could name would look better than the bitmap font.
He only reports results from 26 of the 35 visually-impaired subjects. What happened to the other nine? Of the reported visually-impaired subjects, only 17 out of 26 preferred Tiresias. Eight people preferred Times, in fact.
The deaf subjects had to watch a fake video with fake captions that were so weird that he had a hard time getting people to pay attention to the font.
- What were the results with the deaf group? I don’t know; they weren’t reported. He does list their preferences for font size, but to say it again, a test of captioning fonts for deaf people refused to report the results from deaf subjects.
The whole project dismissed the expertise of type designers. In one of the papers, John Gill wrote that “fonts are normally designed by graphic artists who are primarily interested in æsthetic criteria.”
And on the basis of this “research,” the U.K. standardized on Tiresias for DVB captioning. The entire broadcasting industry in the U.K. and Ireland has been scammed by this junk science. Dr. John Gill should be ashamed of himself.

But it gets worse!

There are now two clones of Tiresias, Tioga and Mayberry.
- Tioga is by a designer who shall remain nameless at Agfa. He, she, or they shall remain nameless because he, she, or they have not been named. Perhaps somebody’s too embarrassed.
- Mayberry is by Steve Matteson from Ascender, who should really know better, because I’ve done work for Ascender. I voted Mayberry as the worst font of 2006 on Typographica.
- These clones are like the original PostScript Arial: They’re “metrically equivalent” to Tiresias, meaning each respective character has the same width. So if you can’t afford a shitty font that was never really tested, you can use a half-assed and shittier font that also was never tested. In fact, you’ve got a choice of two!
- The original Tiresias project was an outright insult to deaf people, but this is even worse. This is an affront to deaf people.

How is research going to fix this problem?

Well, it can, but first we’ll have to flip around the way screenfonts for captioning are developed.

These projects are not driven by type designers, and type designers aren’t picking the existing fonts to compare against. Researchers are running the show, and they can barely distinguish between Courier (“New”) and Times (“New Roman”).

Researchers always end up picking ridiculous candidate fonts that nobody with any type knowledge would ever use.

There’s already a model we can emulate. Let’s look at a book you should all buy: Vision and Art: The Biology of Seeing by Margaret Livingstone. She explains how the eye and the brain interpret visual images, and offers some ideas about how artists might have been subconsciously using those physiological phenomena in the creation of their art.

So yes, when you’re developing screenfonts you always look at the physiology and data from subjects, but you start with the actual knowledge of the artistic field, in this case type design. You start with that even if that knowledge isn’t written down or nobody you’re working with can see the differences that are as plain as day to you and me.

To develop screenfonts for captioning and subtitling, you start with the knowledge and intuitions of type designers and then you test. You don’t let a blindness researcher tell you how to design your font, then gin up some experiments that half-assedly “confirm” how good your font is. You never let a researcher tell you they want function and only function, not “beauty” or “æsthetics.” Because what they call beauty or æsthetics are the features that you know will influence function.

In other words, whatever they did with Tiresias you do the exact opposite.

Principles for designing and testing screenfonts for captions and subtitles

Use actual video. Genre is important. No sex or violence; it’s too distracting.
Choose viable candidate fonts. No Helvetica, Times, Courier, or Arial. Use your own expertise to pick them.
Test real caption features. Use real positioning, colour, italics.
Don’t fake a background mask. Get it exactly right.
Don’t test only signing deaf people. They aren’t the only audience. You have to test hard-of-hearing and hearing people, and in fact two groups of hearing people – native speakers of English, or whatever language, and ESL speakers too. In the U.S. and Canada, hearing people are the majority audience of captioning.
Test performance, not opinions. Test accuracy or retention of facts or eye motions. Do not ask what people think of the font, or, if you really have to, ask at the very end.
Test in all upper case and in mixed case. Captions that scream at you in capital letters are still common and will be for decades.
Test in all presentation modes. Line 21 has three presentation modes, but only two are widely used (pop-on and scrollup). Teletext has two.
Test in all pixel aspect ratios.
If you’re testing existing fonts, test default and wider tracking. Existing fonts are usually too tightly spaced.
Test the extreme cases, too, like cursive and casual fonts in HDTV, because if they’re available then people will use them.
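
Several of these principles multiply against each other, so it helps to write out the full set of test conditions before you recruit anybody. The following is only a sketch, assuming a simple fully crossed design. The font names are hypothetical placeholders, the function name is mine, and pixel aspect ratio is left out because it depends on the delivery format you’re testing.

```python
from itertools import product

# Factors taken from the principles above. The candidate font names are
# hypothetical placeholders, not recommendations.
CANDIDATE_FONTS = ["candidate_a", "candidate_b"]
LETTER_CASES = ["all upper case", "mixed case"]
PRESENTATION_MODES = ["pop-on", "scrollup"]  # the two widely used Line 21 modes
TRACKING = ["default", "wider"]


def build_conditions(fonts, cases, modes, tracking):
    """Return every combination of the factors as a list of dicts."""
    return [
        {"font": f, "case": c, "mode": m, "tracking": t}
        for f, c, m, t in product(fonts, cases, modes, tracking)
    ]


conditions = build_conditions(CANDIDATE_FONTS, LETTER_CASES,
                              PRESENTATION_MODES, TRACKING)
print(len(conditions))  # 16
```

Two fonts already give you 16 conditions, and that is before you add subject groups or genres. Count them up first so you know what you’re asking of your subjects.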

There are some unresolved issues

Characters per unit time may be a better measurement than words per unit time, according to some other research into reading by persons with disabilities.
Exactly where eye fixations lie is unknown, particularly with all-centred captions or subtitles and very particularly with (abominable) centred-scrollup captions.
You cannot use a read-aloud protocol because many of your deaf or hard-of-hearing subjects cannot speak or do so intelligibly. Or if they can speak, it takes so much effort they are distracted from the task.
Communication in general is a problem. With many deaf subjects, you have to communicate visually in one way or another. In general terms, your subjects cannot look simultaneously at you, your interpreter, your note paper, or your real-time captioning and also the test stimulus.

Posted: 2007.09.18