In these uses the sound follows the text or presentation (i.e. the visual part of the presentation is primary, the sound secondary). However, there is a set of applications in which sound occupies the central role. We can call this "hyperspeech," by analogy with "hypertext." In this case the continuous speech stream (or soundtrack) is annotated or punctuated by links to other entities. Perhaps the most common use of hyperspeech is time-aligned text, described (in part) by the <timeline> and <when> elements in section 11.3.2 of the TEI P3 standard. This standard allows absolute or relative timepoints to be specified within a text, thereby permitting the text to be coindexed with the sound.
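To make the mechanism concrete, here is a rough sketch of the sort of encoding this permits (element and attribute names are per my reading of the Guidelines; check the details against the standard before relying on them):

    <timeline origin="T0" unit="s">
      <when id="T0" absolute="00:00:00">        <!-- start of the recording -->
      <when id="T1" interval="2.5" since="T0">  <!-- 2.5 seconds after T0 -->
      <when id="T2" interval="3.1" since="T1">
    </timeline>

    <u who="A" start="T0" end="T1">did you hear that</u>
    <u who="B" start="T1" end="T2">I certainly did</u>

A playback tool could then highlight whichever utterance's start and end timepoints bracket the current position in the sound, or, conversely, jump the sound to the timepoint anchoring a selected stretch of text.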
There is little software that supports such an application directly (correct me if I'm wrong!). That is, there is no off-the-shelf software that implements the TEI <timeline> standard. Many technical issues surrounding the representation of the sound and the indexing still need to be addressed. In addition, sounds, like graphics, require high bandwidth to transmit and to manipulate, creating new technical issues for WWW-based applications. (For an interesting approach to solving one dimension of this problem, see http://www.voyagerco.com/cdlink/.)
There is a wide variety of applications for such hyperspeech technology; I mention below a few that I happen to be aware of.
[I am working in Paris on a project to archive recordings of speech (made by linguists in the course of field work) along with their (synchronized) text transcriptions. Analogue recordings will be digitized and stored (probably on CD) with time-aligned phonetic transcriptions, interlinear glossing, and translations.]
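Concretely, a single aligned record might look something like the following (a sketch only: <timeline> and <when> are TEI, and the TEI linking tagset supplies a synch attribute for exactly this kind of alignment, but the particular layering via <u> and <seg> is my own improvisation):

    <timeline origin="P0" unit="s">
      <when id="P0" absolute="00:00:00">
      <when id="P1" interval="1.8" since="P0">
    </timeline>

    <!-- parallel annotation layers, all coindexed to the same timepoints -->
    <u   start="P0" end="P1">il fait beau</u>
    <seg type="phonetic"    synch="P0 P1">il fE bo</seg>
    <seg type="gloss"       synch="P0 P1">it make.3SG beautiful</seg>
    <seg type="translation" synch="P0 P1">the weather is fine</seg>

The point is that the digitized sound, the phonetic transcription, the interlinear glossing, and the translation would all hang off the same set of timepoints, so any one layer can be used to navigate the others.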