Timed Text Markup Language 2 (TTML2)

W3C Recommendation 08 November 2018

This version:

https://www.w3.org/TR/2018/REC-ttml2-20181108/

Latest version:

https://www.w3.org/TR/ttml2/

Previous version:

https://www.w3.org/TR/2018/PR-ttml2-20181004/

Editors:

Glenn Adams, Skynav

Cyril Concolato, Netflix

Contributing Authors:

Glenn Adams, Skynav

Cyril Concolato, Netflix

Mike Dolan, Invited Expert

Sean Hayes, Microsoft

Frans de Jong, European Broadcasting Union

Dae Kim, Netflix

Pierre-Anthony Lemieux, MovieLabs

Nigel Megitt, British Broadcasting Corporation

Dave Singer, Apple Computer

Jerry Smith, Microsoft

Andreas Tai, Institut für Rundfunktechnik GmbH

Please refer to the errata for this document, which may include normative corrections.

Abstract

This document specifies the Timed Text Markup Language (TTML), Version 2, also known as TTML2, in terms of a vocabulary and semantics thereof.

The Timed Text Markup Language is a content type that represents timed text media for the purpose of interchange among authoring systems. Timed text is textual information that is intrinsically or extrinsically associated with timing information.

It is intended to be used for the purpose of transcoding or exchanging timed text information among legacy distribution content formats presently in use for subtitling and captioning functions.

In addition to being used for interchange among legacy distribution content formats, TTML Content may be used directly as a distribution format, for example, providing a standard content format to reference from a track element in an [HTML 5.2] document, or a text or textstream media element in a [SMIL 3.0] document.

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.

This is the Timed Text Markup Language 2 (TTML2) W3C Recommendation, produced by the Timed Text (TT) Working Group as part of the W3C Video in the Web Activity, following the procedures set out for the W3C Process. The authors of this document are listed in the header of this document.

An implementation report demonstrates that the specification is implementable. A cumulative summary of all changes from TTML1, 3rd Edition, [TTML1] is available at Timed Text Markup Language 2 (TTML2) Change Summary. An abbreviated list of changes affecting language syntax can be found at U Changes to Vocabulary from TTML1.

Comments about this document are welcome by filing an issue on GitHub or sending email to public-tt@w3.org (subscribe, archives) with a subject line starting with [ttml2].

This document has been reviewed by W3C Members, by software developers, and by other W3C groups and interested parties, and is endorsed by the Director as a W3C Recommendation. It is a stable document and may be used as reference material or cited from another document. W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability of the Web.

This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

1 Introduction

Unless specified otherwise, this section and its sub-sections are non-normative.

The Timed Text Markup Language (TTML), Version 2, also referred to as TTML2, provides a standardized representation of a particular subset of textual information with which stylistic, layout, and timing semantics are associated by an author or an authoring system for the purpose of interchange and processing.

TTML is expressly designed to meet only a limited set of requirements established by [TTAF1-REQ], and summarized in M Requirements. In particular, only those requirements which service the need of performing interchange with existing, legacy distribution systems are satisfied.

In addition to being used for interchange among legacy distribution content formats, TTML Content may be used directly as a distribution format, providing, for example, a standard content format to reference from a track element in an [HTML 5.2] document, or a text or textstream media element in a [SMIL 3.0] document. Certain properties of TTML support streamability of content, as described in R Streaming TTML Content.

Note:

While TTML is not expressly designed for direct (embedded) integration into an HTML or a SMIL document instance, such integration is not precluded.

Note:

In some contexts of use, it may be appropriate to employ animated content to depict sign language representations of the same content as expressed by a Timed Text document instance. This use case is not explicitly addressed by TTML mechanisms, but may be addressed by some external multimedia integration technology, such as SMIL.

Note:

In previous drafts of this specification, TTML was referred to as DFXP (Distribution Format Exchange Profile). This latter term is retained for historical reasons in certain contexts, such as profile names and designators.

1.1 System Model

Use of TTML is intended to function in a wider context of Timed Text Authoring, Transcoding, Distribution and Presentation mechanisms that are based upon the system model depicted in Figure 1 – System Model, wherein the Timed Text Markup Language serves as a bidirectional interchange format among a heterogeneous collection of authoring systems, and as a unidirectional interchange format to a heterogeneous collection of distribution formats after undergoing transcoding or compilation to the target distribution formats as required, and where one particular distribution format is a TTML Content Document.

Two classes of processor are described. Authoring systems and validation processors are examples of Transformation Processors; transcoding systems and rendering processors are examples of Presentation Processors. A TTML Profile Document can be associated with a TTML Content Document or a processor, to allow each to express those features that are available, prohibited or required. Collectively this allows the constraints of the chain from authoring to presentation to be expressed in a formal language.

Processors can implement the defined mapping to TTML Intermediate Documents. The system model depicts one such rendering processor that further maps those documents into HTML and CSS fragments that could be inserted into an [HTML 5.2] document for display by a user agent.

Figure 1 – System Model

1.2 Document Example

A TTML document instance consists of a tt document element that contains a header and a body, where the header specifies document level metadata, styling definitions and layout definitions, and the body specifies text content intermixed with references to style and layout information and inline styling and timing information.

Example Fragment – TTML Document Structure
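As a minimal sketch of the structure described above (an illustration, not the specification's own listing), a document instance might be laid out as follows. The xml:lang value "en" is an assumption chosen for the example.

```xml
<tt xml:lang="en" xmlns="http://www.w3.org/ns/ttml">
  <head>
    <metadata/>  <!-- document level metadata -->
    <styling/>   <!-- styling definitions -->
    <layout/>    <!-- layout definitions -->
  </head>
  <body/>        <!-- text content with style, layout, and timing references -->
</tt>
```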

Document level metadata may specify a document title, description, and copyright information. In addition, arbitrary metadata drawn from other namespaces may be specified.

Example Fragment – TTML Metadata
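<metadata xmlns:ttm="http://www.w3.org/ns/ttml#metadata">
  <ttm:title>Timed Text TTML Example</ttm:title>
  <ttm:copyright>The Authors (c) 2006</ttm:copyright>
</metadata>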

Styling information may be specified in the form of style specification definitions that are referenced by layout and content information, specified inline with content information, or both.

In Example Fragment – TTML Styling, four style sets of specifications are defined, with one set serving as a collection of default styles.

Example Fragment – TTML Styling
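A sketch of what four such style sets could look like is given below. The identifiers (s1, s2, s1Right, s2Right) and the particular property values are assumptions made for illustration; only the overall pattern, one default set that the others build on, is taken from the text.

```xml
<styling xmlns="http://www.w3.org/ns/ttml"
         xmlns:tts="http://www.w3.org/ns/ttml#styling">
  <!-- s1 serves as the collection of default styles -->
  <style xml:id="s1"
         tts:color="white"
         tts:fontFamily="proportionalSansSerif"
         tts:fontSize="22px"
         tts:textAlign="center"/>
  <!-- the remaining sets reference another set and override one property -->
  <style xml:id="s2" style="s1" tts:color="yellow"/>
  <style xml:id="s1Right" style="s1" tts:textAlign="end"/>
  <style xml:id="s2Right" style="s2" tts:textAlign="end"/>
</styling>
```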

If a style element appears as a descendant of a region element, then the style element must be ignored for the purpose of computing referential styles as defined by 10.4.1.2 Referential Styling and 10.4.1.3 Chained Referential Styling.

Note:

That is to say, when referential styling is used by an element to refer to a style element, then the referenced style element must appear as a descendant of the styling element, and not in any other context.
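The distinction can be sketched as follows, assuming hypothetical identifiers (base, r1) and arbitrary property values. The style element inside styling can be the target of a style attribute; the one nested in the region contributes only to that region's own styles.

```xml
<tt xml:lang="en"
    xmlns="http://www.w3.org/ns/ttml"
    xmlns:tts="http://www.w3.org/ns/ttml#styling">
  <head>
    <styling>
      <!-- descendant of styling: may be referenced via style="base" -->
      <style xml:id="base" tts:color="white"/>
    </styling>
    <layout>
      <region xml:id="r1">
        <!-- descendant of region: ignored when computing referential styles -->
        <style tts:backgroundColor="black"/>
      </region>
    </layout>
  </head>
  <body>
    <div region="r1">
      <p style="base">Styled by reference to a style element under styling</p>
    </div>
  </body>
</tt>
```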

Note:

If a condition attribute applies to a style element and that condition evaluates to false, then its nested and inline styles are ignored; however, styles that would be included by means of referential or chained referential styling are not ignored. See 10.4.4.2 Specified Style Set Processing for further details.

10.1.3 styling

The styling element is a container element used to group styling matter, including metadata that applies to styling matter.

The styling element accepts as its children zero or more elements in the Metadata.class element group, followed by zero or more initial elements, followed by zero or more style elements.

XML Representation – Element Information Item: styling
xml:base =
xml:id = ID
xml:lang = xsd:string
xml:space = ("default" | "preserve")
Content: Metadata.class*, initial*, style*

To the extent that time semantics apply to the content of the styling element, the implied time interval of this element is defined to be coterminous with the root temporal extent.
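The child ordering described above (metadata first, then initial elements, then style elements) might look like this in practice. This is a sketch only: the description text, the identifier speaker, and the chosen property values are assumptions made for illustration.

```xml
<styling xmlns="http://www.w3.org/ns/ttml"
         xmlns:ttm="http://www.w3.org/ns/ttml#metadata"
         xmlns:tts="http://www.w3.org/ns/ttml#styling">
  <!-- zero or more Metadata.class elements come first -->
  <metadata>
    <ttm:desc>Styles shared by all captions</ttm:desc>
  </metadata>
  <!-- then zero or more initial elements -->
  <initial tts:color="yellow"/>
  <!-- then zero or more style elements -->
  <style xml:id="speaker" tts:fontStyle="italic"/>
</styling>
```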

10.2 Styling Attribute Vocabulary

This section defines the 10.2.1 style attribute used with certain animation elements, content elements, certain layout elements, and style definition elements.

In addition, this section specifies the following attributes in the TT Style Namespace for use with style definition elements, certain layout elements, and content elements that support inline style specifications:

In addition to the above visual styling attributes, this section specifies the following audio styling attributes in the TT Audio Style Namespace for use with style definition elements and content elements that support inline style specifications:

Each style attribute (and corresponding property) defined in this section makes use of a style property definition table, which specifies one or more of the following aspects of the style: value syntax, initial value, elements to which style semantically applies, whether style is inherited or not, how percentage values are interpreted (if applicable), whether (and how) style is animatable, and semantic basis (derivation).

For animatable styles, the term discrete refers to the use of either a set element or an animate element with the discrete value for its calcMode attribute. The term continuous refers to the use of an animate element with a linear, paced, or spline value for its calcMode attribute. The term none indicates the style is not animatable.
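The two kinds of animation can be sketched as below. This is an illustration under stated assumptions: that the mode is selected through the animate element's calcMode attribute, that animate takes a semicolon-separated list of values in the animated style attribute, and that tts:color supports both discrete and continuous animation. The timing values and colors are arbitrary.

```xml
<p xmlns="http://www.w3.org/ns/ttml"
   xmlns:tts="http://www.w3.org/ns/ttml#styling"
   begin="0s" dur="6s">
  <span tts:color="white">
    <!-- discrete: the color switches to yellow for two seconds -->
    <set begin="1s" dur="2s" tts:color="yellow"/>
    Discrete change
  </span>
  <span tts:color="white">
    <!-- continuous: the color is interpolated from white to red -->
    <animate begin="3s" dur="2s" calcMode="linear" tts:color="white;red"/>
    Continuous change
  </span>
</p>
```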

Note:

This specification makes use of lowerCamelCased local names for style attributes that are based upon like-named properties defined by [XSL-FO 1.1]. This convention is likewise extended to token values of such properties.

Note:

An inheritable style property may be expressed as a specified attribute on a region element or on a content element type independently of whether the property applies to that element type. This capability permits the expression of an inheritable style property on ancestor elements to which the property does not apply.
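For instance, assuming tts:textAlign is an inheritable property that applies to p but not to div, it can still be specified once on a div so that the enclosed paragraphs inherit it. The fragment below is a sketch with arbitrary content.

```xml
<body xmlns="http://www.w3.org/ns/ttml"
      xmlns:tts="http://www.w3.org/ns/ttml#styling">
  <!-- the property does not apply to div itself, but is inherited by each p -->
  <div tts:textAlign="end">
    <p>Inherits end alignment from the div</p>
    <!-- an inline specification on the p overrides the inherited value -->
    <p tts:textAlign="center">Explicitly centered</p>
  </div>
</body>
```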

Note:

Due to the general syntax of this specification (and the schemas it references) with respect to how style attributes are specified, particularly for the purpose of supporting inheritance, it is possible for an author to inadvertently specify a non-inheritable style attribute on an element that applies neither to that element nor to any of its descendants while still remaining conformant from a content validity perspective. Content authors may wish to make use of TTML content verification tools that detect and warn about such usage.

10.2.1 style

The style attribute is used by referential style association to reference one or more style elements, each of which defines a style (property) set.

The style attribute may be specified by an instance of the following element types:

If specified, the value of a style attribute must adhere to the IDREFS data type defined by [XML Schema Part 2], §3.3.10, and, furthermore, each IDREF must reference a style element which has a styling element as an ancestor.

If the same IDREF, ID₁, appears more than one time in the value of a style attribute, then there should be an intervening IDREF, ID₂, where ID₂ is not equal to ID₁.

Note:

This constraint is intended to discourage the use of redundant referential styling while still allowing the same style to be referenced multiple times in order to potentially override prior referenced styles, e.g., when an intervening, distinct style is referenced in the IDREFS list.

Note:

See the specific element type definitions that permit use of the style attribute, as well as 10.4.1.2 Referential Styling and 10.4.1.3 Chained Referential Styling, for further information on its semantics.
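A sketch of IDREFS usage follows, with hypothetical identifiers (base, emph) and arbitrary property values. The second paragraph repeats an IDREF with a distinct, intervening IDREF between the two occurrences, which is the pattern the constraint above permits.

```xml
<tt xml:lang="en"
    xmlns="http://www.w3.org/ns/ttml"
    xmlns:tts="http://www.w3.org/ns/ttml#styling">
  <head>
    <styling>
      <style xml:id="base" tts:color="white" tts:fontStyle="normal"/>
      <style xml:id="emph" tts:color="yellow" tts:fontStyle="italic"/>
    </styling>
  </head>
  <body>
    <div>
      <!-- two distinct references, each to a style element under styling -->
      <p style="base emph">References two style sets</p>
      <!-- base appears twice, separated by the distinct reference emph -->
      <p style="base emph base">References base again after emph</p>
    </div>
  </body>
</tt>
```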

10.2.2 tts:backgroundClip

The tts:backgroundClip attribute is used to specify a style property that determines the background painting rectangle within which the background is painted.

Values:	`"border"` | `"content"` | `"padding"`
Initial:	`border`
Applies to:	`body`, `div`, `image`, `p`, `region`, `span`
Inherited:	no
Percentages:	N/A
Animatable:	discrete
Semantic basis:	backgroundClip derivation

The tts:backgroundClip style is illustrated by the following example.

Example Fragment – Background Clip
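A minimal sketch of the attribute in use is shown below; it is not the specification's own listing. It assumes that tts:border and tts:padding may be specified on a p element with the value syntax shown, and the colors and lengths are arbitrary.

```xml
<p xmlns="http://www.w3.org/ns/ttml"
   xmlns:tts="http://www.w3.org/ns/ttml#styling"
   tts:backgroundColor="blue"
   tts:backgroundClip="content"
   tts:border="4px solid white"
   tts:padding="8px">
  <!-- with "content", the background is confined to the content rectangle -->
  Background painted behind the content area only
</p>
```

With the initial value border, the background would instead extend beneath the padding and border areas.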