Joe Clark: joeclark.org


SGML for captioning

SGML can be applied to the practice of captioning film and video. SGML encodes structure, not overt form. But overt form is what most captioning viewers – and commentators – are familiar with.

Let's take an example and think of the uses of italics in American captioning. Here italics are the overt format, and the list below itemizes some of the underlying structure or function represented by italics.

There are others, and this is not an invitation to spend the rest of our lives itemizing them. The point is that SGML for access needs to encode the function and then let the interpreting program decide on the format. For example:

<sound effect>phone rings</sound effect>

could take the overt form of

( phone rings )

at the Caption Center (apologies for the use of the <I> tag) or

[PHONE RINGS]

at NCI or

[ Phone Rings ]

at Captions, Inc.
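
As a rough illustration of "let the interpreting program decide on format," here is a minimal Python sketch. The house-style keys and the function name render_sound_effect are hypothetical; the formatting rules are only what can be read off the three examples above, and the Caption Center's italics are left out because plain text cannot show them.

```python
# Hypothetical house-style table. The keys are illustrative names; the
# bracket and capitalization rules are inferred from the three examples.
HOUSE_STYLES = {
    "caption_center": lambda text: f"( {text.lower()} )",
    "nci": lambda text: f"[{text.upper()}]",
    "captions_inc": lambda text: f"[ {text.title()} ]",
}


def render_sound_effect(text: str, house: str) -> str:
    """Turn a structurally tagged sound effect into one captioner's overt form."""
    return HOUSE_STYLES[house](text)


# The same encoded content, rendered once per house style.
for house in HOUSE_STYLES:
    print(house, "->", render_sound_effect("phone rings", house))
```

The caption file itself never changes; only the lookup table differs from one captioner to the next.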

If we encoded the caption file structurally, we could very easily transfer information among captioners. (Yes, overseas, too.) That includes the most common transformation – syndication reformats or anything that simply requires a global offset in timecode, or an offset after every commercial break.

<offset 0:01:13.26>
<offset 0>
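
What a program might do on meeting such an offset can be sketched in Python. This is only a sketch under assumptions: it reads the timecode as h:mm:ss.ff with the last field counting frames, it assumes a flat 30 frames per second with no drop-frame correction, and the function names are hypothetical.

```python
FPS = 30  # assumption: nominal 30 fps, non-drop-frame counting


def timecode_to_frames(tc: str, fps: int = FPS) -> int:
    """Convert h:mm:ss.ff (or a shorter form such as "0") to a frame count."""
    clock, _, frames = tc.partition(".")
    parts = [int(p) for p in clock.split(":")]
    # Pad missing leading fields so "0" and "13.26" are accepted too.
    hours, minutes, seconds = [0] * (3 - len(parts)) + parts
    return (hours * 3600 + minutes * 60 + seconds) * fps + int(frames or 0)


def frames_to_timecode(total: int, fps: int = FPS) -> str:
    """Convert a frame count back to h:mm:ss.ff."""
    seconds, frames = divmod(total, fps)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}.{frames:02d}"


def apply_offset(tc: str, offset: str, fps: int = FPS) -> str:
    """Shift one caption timecode by the given offset."""
    total = timecode_to_frames(tc, fps) + timecode_to_frames(offset, fps)
    return frames_to_timecode(total, fps)


print(apply_offset("0:10:00.10", "0:01:13.26"))  # 0:11:14.06
print(apply_offset("0:10:00.10", "0"))           # 0:10:00.10
```

An offset after every commercial break would amount to calling apply_offset with a different offset value for each segment.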

Or have the machine do a first pass at calculating new timecodes for an NTSC-to-PAL transfer:

<fps original=30 new=25>

where fps = frames per second. This could save NCI lots of time in reformatting Line 21 CC movies for UK Line 22 CC, not that NCI is overburdened with Line 22 business or all that interested in open standardization.
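
A first pass of that kind could look something like the Python sketch below. It takes the nominal 30 and 25 from the tag at face value, assumes timecodes written as h:mm:ss.ff with a trailing frame field, and assumes elapsed time is preserved, so only the frame field is rescaled. The function name convert_timecode is hypothetical, and a real transfer may involve speed changes this sketch ignores.

```python
def convert_timecode(tc: str, original_fps: int = 30, new_fps: int = 25) -> str:
    """Rescale the frame field of h:mm:ss.ff from one frame rate to another."""
    clock, _, frames = tc.partition(".")
    # Hours, minutes and seconds are unchanged when elapsed time is preserved.
    # Floor division keeps the new frame number below new_fps.
    new_frames = int(frames or 0) * new_fps // original_fps
    return f"{clock}.{new_frames:02d}"


print(convert_timecode("0:01:13.26"))  # 0:01:13.21
```

The result is only a starting point; a human would still need to check the captions against the converted picture.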

Also, information could be formatted for different kinds of displays, viz.:

simply by selecting that output device as the desired one – all from the same easily encoded data.

Get the idea?