Updated 2002.10.01
Not a lot is known about the accessibility capabilities of DVDs, at least by home DVD viewers. This page attempts to explain some of the finer points of technical capabilities of DVDs when it comes to captioning, subtitling, audio description, and dubbing.
Now, have you read the easy explanation of capabilities yet? Have a look at that first.
In a later posting, we’ll use all this information to debunk a few myths about DVD accessibility, chief among them “I don’t have room on my disc for extraneous stuff like that.”
Despite spending hours poring over books and trying to understand these topics well enough to explain them simply, I’m conceding right off the bat that this explanation is, in fact, too difficult and opaque. I ask readers to write in to tell me exactly which details, no matter how small, they think need to be explained better.
I still need definitive answers for quite a few questions here, marked with [OPEN QUESTION]. Can you help? Send E-mail.
Short of joining the DVD Forum and buying the mysterious, costly, and no-doubt-impenetrable Books, one has no real choice but to rely on the two leading books on DVD production:
(Jim Taylor lists a couple of other books in his massive DVD FAQ.)
The bit budget is the total number of bits available for use on a DVD. The exact figure varies from DVD format to format, but DVD-Video discs can nominally carry from 1.46 to 17.08 billion bytes of data, depending on the following [Taylor, 184]:
[W]hat usually happens is that the content is compressed to fit the capacity of the disc. If the movie is 110 minutes long and has two audio tracks, the video bit budget can be set at a much higher 4.9 Mbps to achieve better quality. Or a 2 ½-hour movie might be compressed slightly more than usual if the disc producer determines that the video quality is acceptable. [...]
The maximum video data rate of DVD is limited to 9.8 Mbps by the DVD-Video specification. The maximum combined rate of video, audio, and subtitles is limited to 10.08 Mbps [Taylor, 177–178].
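To make that arithmetic concrete, here is a rough sketch of how a video bit budget might be worked out. It assumes a single-layer disc of 4.7 billion bytes and audio tracks at 384 kbps each; both figures are my assumptions for illustration, not numbers quoted from Taylor, and the function name is made up. Only the 9.8 Mbps ceiling comes from the text above.

```python
MAX_VIDEO_BPS = 9_800_000  # video ceiling from the DVD-Video spec, per Taylor


def video_bit_budget(capacity_bytes, minutes, audio_tracks, audio_bps=384_000):
    """Average video bitrate (bits per second) left over after audio.

    capacity_bytes and audio_bps are illustrative assumptions; real discs
    also spend bits on subpictures, navigation data, and overhead.
    """
    total_bps = capacity_bytes * 8 / (minutes * 60)  # whole disc spread over the running time
    video_bps = total_bps - audio_tracks * audio_bps  # subtract every audio stream
    return min(video_bps, MAX_VIDEO_BPS)              # never exceed the spec's video limit


# 110-minute movie, two audio tracks, assumed 4.7-billion-byte disc
budget = video_bit_budget(4_700_000_000, 110, audio_tracks=2)
print(f"{budget / 1e6:.2f} Mbps")  # prints 4.93 Mbps
```

Under those assumptions the 110-minute, two-track example lands at about 4.93 Mbps, which is in line with the 4.9 Mbps figure quoted above.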
Another feature you might not have known existed: The DVD author can specify that certain features are in fact captions or audio descriptions, and can specify the language they appear in, among other capacities.
How is that useful? Your DVD player, if it pays attention to those things, can automatically play, for example, English “subtitles” or English “audio for visually impaired” if either is present on a disc. You the viewer don’t have to do anything.
There is an interaction here: The DVD has to be coded properly, so that audio and subtitle tracks have their language and other status correctly assigned (e.g., English audio, French audio, English subtitles, Spanish subtitles), and your player has to allow you to request all those settings. The system can fall down at any link in the chain.
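Here is a minimal sketch of that chain, assuming a player that simply looks for the first track whose attributes match its stored preference. The `Track` class, the `pick_track` function, and the attribute strings are all hypothetical stand-ins of mine; they are not the values or structures used in the DVD spec.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Track:
    kind: str       # "audio" or "subpicture"
    language: str   # e.g. "en"; empty string if the author never set it
    extension: str  # e.g. "normal", "caption"; empty string if never set


def pick_track(tracks: Sequence[Track], kind: str, language: str,
               extension: str) -> Optional[int]:
    """Return the index of the first track matching the player's preference."""
    for index, track in enumerate(tracks):
        if (track.kind, track.language, track.extension) == (kind, language, extension):
            return index
    return None  # nothing matched; the viewer has to choose by hand


# A properly coded disc: the player finds English subtitles on its own.
coded = [Track("audio", "en", "normal"), Track("subpicture", "en", "normal")]
# The same disc with the attributes left blank by the author.
uncoded = [Track("audio", "", ""), Track("subpicture", "", "")]

print(pick_track(coded, "subpicture", "en", "normal"))    # 1
print(pick_track(uncoded, "subpicture", "en", "normal"))  # None
```

The second case is the failure described above: the subtitles are on the disc, but without the right attributes the player has nothing to match against.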
My Hitachi DV-C605U is set to play English audio (normal audio, not “audio for visually impaired”) and English “subtitles” automatically. In fact, it does not. I have to select subtitles manually. There is no way to tell if the player is faulty or if DVD authors have simply failed to indicate that certain audio and subtitle tracks are stored in those languages. (One would have to inspect the files used to create the DVD.)
So: Let’s look at the content settings first (as opposed to player settings).
Taylor [265]: A video object set (“one or more video objects... A video object is part or all of an MPEG-2 program stream”) can carry a “code extension” indicating it is “caption” or “for visually impaired,” among other settings. Confusingly, Taylor lists those options in a section entitled “VOBS Audio Attributes.” (A VOBS is a video object set.) However, captions are quite obviously not audible. (Jim Taylor writes: “Don't blame me, it’s in the spec. I can only assume they mean an ‘audio caption’ other than for audio for visually impaired (DVS). Maybe a narration track?”)
Taylor goes on to say [266] that subpictures (typically used for captions and subtitles) can carry the following code extensions. (I added the bracketed letters; see below.)
Now, the problem here is that the term “caption” seems to be used to mean three different things. I think the items marked [S] above are actually subtitles, [C] items are captions, and [K] items are keys or overlays of text (like the name of a reporter or a city that you would see in a TV news report). Oh, but hold that thought! We may have it cleared up shortly.
So those are the content settings. How can you set up your player? Taylor [281] says all the following are possible, though players may not actually implement them:
Note the terminology differences. Jim Taylor writes: “This confusion is partly my fault. I ‘translated’ the descriptions from the spec in the player settings section, since I had the same reaction as you ([S] vs. [C] vs. [K]), but I neglected to apply the same translation in the content settings section. They are the same.”
Only the “forced subtitles” setting seems unclear. I suppose one could use this feature for a film in, say, English, with extended segments in Russian and Ukrainian that are subtitled in the original theatrical production. (That’s an actual example – look at Sum of All Fears, in which the subtitles actually move around to signify speakers, proving that subtitles can, in fact, act just like captions and audiences will not undergo cardiac arrest.)
The DVD version of such a film could force English subtitles just for those sections whether English subtitles had been otherwise chosen or not. Since language settings can interact with audio and subpicture settings, it would also be possible to force French subtitles to appear during the French-dubbed version.
This, at least, is my interpretation.
This still leaves the question of why the DVD spec allows the owner of a player to set forced captions to appear automatically. Forced captions will always appear because the author tagged them as forced. By definition, viewer choice is not part of the picture. (Jim Taylor writes: “The same set of attribute values are used for content and player. Presumably a player would never actually let you set the ‘forced subtitles’ as your preferred format.”)
How much data is streaming through your DVD player at any given time? We’ll limit the discussion to subpictures and audio tracks. (Line 21 closed captions use about 120 bytes per second – next to nothing.)
First, a little-understood fact. When your DVD-Video is playing, all the audio tracks and all the subpicture tracks are being processed simultaneously! That’s how you can switch from one to another instantaneously. (The DVD author can turn that feature off, annoyingly enough. That’s done mostly if there’s such a difference in the dynamic range of, say, stereo audio vs. DTS audio that viewers who aren’t audio engineers – and that’s most of us – would think there’s something wrong with the audio levels. I don’t see any reason at all to disallow “user operation” of the subtitle feature, but some authors do it.)
Taylor [252] states that the “typical” stream data rate for a subpicture track is 10 kbps. Much depends on the content of the subpictures, of course.
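To put those rates in perspective, here is a quick back-of-envelope sketch that converts a stream rate into total megabytes over a whole movie. The 120-minute running time is an assumption of mine, as is the function name; the two rates are the ones quoted above.

```python
def stream_megabytes(bits_per_second, minutes):
    """Total size of a constant-rate stream, in millions of bytes."""
    return bits_per_second * minutes * 60 / 8 / 1_000_000


LINE21_BPS = 120 * 8     # about 120 bytes per second, as quoted above
SUBPICTURE_BPS = 10_000  # Taylor's "typical" subpicture rate

# Assumed 120-minute movie
print(stream_megabytes(SUBPICTURE_BPS, 120))  # 9.0
print(stream_megabytes(LINE21_BPS, 120))      # 0.864
```

So at the typical rate, a full-length subpicture track comes to roughly nine million bytes, and Line 21 captions to well under one million.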
LaBarge [75] lists the following typical bitrates for audio tracks. Here, 1 kbps = 1,000 bps or 1,000 bits per second.
Now, for audio description, it would be unusual to employ anything greater than Dolby Digital Surround. Dolby Digital Stereo would be the norm, and I think all my DVDs with audio description use that format for the description track.
It is thus possible to add a description track yet ensure that it eats up the smallest possible chunk of data on the disc. Theoretically, the only Dolby Stereo track on a commercial disc could be the description track; everything else could be higher-quality.
A case could be made that the description track should use the same format as the main audio in that language. Equal access, equal rights, equal dignity, that sort of thing. But the primary language of the audio almost always gets highest priority. It’s even difficult to find equivalent audio formats in English and French or English and Spanish; the non-English track tends to be of lower quality. Since we’re talking about a secondary English track, it’s gonna be tough sledding to persuade authors to eat up bits purely for parity or equivalency with the audio meant for sighted people.
You can completely replace the subpicture bitmap with each frame or field. Remember, you can put anything you want in that bitmap, including an entire set of caption or subtitle lines.
In practice, you’d never use that feature, because there isn’t a human being alive who can read a caption that’s displayed for 1/30 of a second or less.
NTSC DVDs – the DVDs we have in North America – can carry conventional Line 21 closed captions. All DVDs can carry subtitles, which can be put to the same use as captions. The data format used for the subtitles is known as subpictures. They are nearly-full-screen bitmaps, meaning you can put anything you want in them.
Subpictures are very compact. Bitmaps can only be so big in the first place, especially when they’re as simple as these. Run-length encoding reduces their size even further.
It’s a system of compression where repeated elements are represented in a code like “Show this many copies of this item” rather than encoding that many actual copies of the actual item. As Jim Taylor explains it, “rather than storing, say, 100 red dots, you store one red dot and a count of 100” [Taylor, 91].
DVD-Video uses run-length compression for subpictures, which contain captions and simple graphic overlays. The legibility of subtitles is critical, so it is important that no detail be lost. DVD limits subpictures to four colours at a time, so there are lots of repeating runs of colours, making them perfect candidates for run-length compression. Compressed subpicture data makes up less than 0.5% of a typical DVD-Video program [Taylor, 91, emphasis added].
Subpictures, then, take up next to no space on a DVD.
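For the curious, here is a minimal sketch of run-length encoding in general. It is not the actual DVD subpicture bitstream format, which I have not seen documented in detail; the function names and the sample row are mine. The values 0 to 3 stand in for the four colours a subpicture can show at one time.

```python
from itertools import groupby


def rle_encode(pixels):
    """Collapse each run of identical values into a (value, count) pair."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]


def rle_decode(runs):
    """Expand (value, count) pairs back into the original sequence."""
    return [value for value, count in runs for _ in range(count)]


# One hypothetical 720-pixel row: mostly background with a short stroke of text
row = [0] * 300 + [1] * 20 + [0] * 400
runs = rle_encode(row)

print(runs)                     # [(0, 300), (1, 20), (0, 400)]
print(rle_decode(runs) == row)  # True
```

Three pairs stand in for 720 pixels, and nothing is lost on decoding, which is exactly what you want when legibility matters.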
Why aren’t Line 21 closed captions automatically included on a DVD whenever you convert a captioned videotape to DVD? Why are PAL-format World System Teletext captions not included?
Taylor [91] puts it simply: “Video recording starts at line 23” in an NTSC picture (but where in a PAL picture? [OPEN QUESTION]). That neatly cuts out Line 21 and every line above it, which, incidentally, is where World System Teletext codes reside, presuming similar behaviour is true in PAL encoding.
So if there are no lines above 23 on a DVD, how can NTSC DVDs be closed-captioned? Because the closed-captioning data is stored elsewhere on the DVD and regenerated upon playback. Some players (not very many, but some) cannot manage this at all; my own player is somewhat finicky, choosing on some days not to transmit captions at all.
So how much of the screen can you eat up with subpictures? Nearly the whole thing: In NTSC, a 720 × 478 block, and in PAL, 720 × 573 [Taylor, 310].
The full picture size in NTSC is 720 × 480 pixels and, in PAL, 720 × 576. So you lose two lines of height in NTSC and three in PAL; the width is untouched. I suppose those missing lines could be discernible if you had a hot white background and you used jet-black subpictures, but we live with these things. (Jim Taylor adds: “And you had a studio monitor with underscan. Since consumer TVs have overscan [edges of picture are not shown] then 99.9% of viewers would never miss the missing pixels.”)
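The arithmetic behind those figures is simple enough to check in a few lines. This is just a sketch using the dimensions quoted above; the dictionary and function names are mine.

```python
FULL_PICTURE = {"NTSC": (720, 480), "PAL": (720, 576)}  # (width, height)
SUBPICTURE = {"NTSC": (720, 478), "PAL": (720, 573)}    # per Taylor, 310


def lost_lines(system):
    """Rows of the picture that a subpicture cannot reach."""
    return FULL_PICTURE[system][1] - SUBPICTURE[system][1]


def coverage(system):
    """Fraction of the full picture area a subpicture can cover."""
    full_w, full_h = FULL_PICTURE[system]
    sub_w, sub_h = SUBPICTURE[system]
    return (sub_w * sub_h) / (full_w * full_h)


for system in ("NTSC", "PAL"):
    print(system, lost_lines(system), f"{coverage(system):.2%}")
# NTSC 2 99.58%
# PAL 3 99.48%
```

Since the widths are identical, the entire difference is in height, and the subpicture still covers more than 99% of the picture in both systems.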
Why is colour so poor in DVD subpictures? Usually, the only captions and subtitles we see are single-colour.
Well, it seems that there are no individual pixels in subpictures. Instead, you’ve got four pixels stacked on top of each other that may be displayed in different combinations. “The four pixel types are officially designated as foreground (also called pattern), background, emphasis-1, and emphasis-2, but are not strictly tied to these functions.... Each pixel type is associated with one colour from a palette of 16 and one contrast or transparency level.... The transparency is set directly, from invisible (0), through 14 levels of transparency (1–14), to opaque (15)” [Taylor, 310].
The background colour is typically set to transparent, and foreground tends to be white or yellow. The emphasis-1 and emphasis-2 colours would typically not be used for captioning or subtitling. But here’s a related fact: The same subpictures that are used for captioning are also used to create simple menu buttons. You’ve probably seen two kinds of emphasis in menu buttons already – the first kind when you place the cursor on the button and the second when you select it.
It seems, then, that only the foreground and background colours are typically used for captioning and subtitling.
“The colour and contrast of the four pixel types can be changed for each frame or field; the palettes can be changed for each PGC” [Taylor, 311]. (We’ll explain PGC shortly.) In other words, you’ve got 16 colours to play with for each of the four pixel types. In practice, you can make subtitles or captions in one of 16 colours.
You can pick and choose colours from that set of 16 with every field or frame. (Video signals are made up of frames, or still images. There are roughly 30 frames per second in North American television. A frame, though, is subdivided into two fields.)
You can change the entire set of 16 colours from which pixel colours are chosen with each PGC.
All right. What’s a PGC? A program chain. “DVD-V content is broken into titles (movies or albums), and parts of titles (chapters or songs). Titles are made up of cells linked together by one or more program chains (PGC)” (DVD FAQ). For a typical movie DVD, then, a PGC is a string of chapters. You can change the entire set of 16 possible colours with each chain of chapters.
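Here is one way to picture the arrangement as a data structure. It is a sketch of my own, not the layout the spec uses: the `PixelStyle` class and `resolve` function are hypothetical, and the palette is written as RGB triples purely for readability. The palette of 16 is what can change with each PGC; the style attached to each of the four pixel types is what can change with each frame or field.

```python
from dataclasses import dataclass

PIXEL_TYPES = ("background", "foreground", "emphasis-1", "emphasis-2")


@dataclass(frozen=True)
class PixelStyle:
    palette_index: int  # which of the 16 palette colours to use (0-15)
    contrast: int       # 0 = invisible, 15 = opaque

    def __post_init__(self):
        if not 0 <= self.palette_index <= 15:
            raise ValueError("palette_index must be 0-15")
        if not 0 <= self.contrast <= 15:
            raise ValueError("contrast must be 0-15")


def resolve(palette, styles, pixel_type):
    """Look up the colour and contrast currently assigned to a pixel type."""
    style = styles[pixel_type]
    return palette[style.palette_index], style.contrast


# Hypothetical 16-colour palette for one PGC: black, white, yellow, then filler grey
palette = [(0, 0, 0), (255, 255, 255), (255, 255, 0)] + [(128, 128, 128)] * 13

# Typical captioning setup: transparent background, opaque white text
styles = {
    "background": PixelStyle(palette_index=0, contrast=0),
    "foreground": PixelStyle(palette_index=1, contrast=15),
    "emphasis-1": PixelStyle(palette_index=0, contrast=0),
    "emphasis-2": PixelStyle(palette_index=0, contrast=0),
}
print(resolve(palette, styles, "foreground"))  # ((255, 255, 255), 15)

# A later frame switches the text to yellow without touching the palette
styles["foreground"] = PixelStyle(palette_index=2, contrast=15)
print(resolve(palette, styles, "foreground"))  # ((255, 255, 0), 15)
```

Changing a caption's colour, then, is just a matter of pointing the foreground at a different palette entry; the bitmap itself does not have to change.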
If your disc contains widescreen and full-screen versions of the same movie, or if you can buy entirely different discs with their own aspect ratios (the ratio of picture width to height), well, surprise! you can specify different caption or subtitle tracks for each aspect ratio.
It’s actually quite necessary to differentiate aspect ratios in captioning because the positions of actors in a movie change in widescreen compared to pan-and-scan full-screen presentations. You’ve also got more room to play with: A typical widescreen movie is letterboxed, and you can jam a great deal of caption text into that black bar below the picture without causing “distraction.” Compared to what you can get away with by superimposing text over live video, you’ve got loads of room. In fact, if three-line captions are deemed undesirable in a full-screen movie (some people deem them so; I don’t), you can certainly use them in the widescreen version since two of the lines will be outside the visible picture!
Anyway, LaBarge [164] says: “Subpicture streams can have aspect-ratio parameters associated with them to tell the DVD-Video player if the stream should be used in normal, letterbox, or widescreen display modes.”
Now, how many captioners actually produce captions in different aspect ratios? I know one that claims to do it, but I guess it all boils down to client requests and budgets.