Create WebVTT or TTML files with Caption Maker

What are WebVTT or TTML files?

Before you can create caption-enabled video webpages, you need to have caption (or text track) files to use. WebVTT or TTML format files are text files that can be displayed by the Internet Explorer 10 video player, or captured using scripting. Text track files can be created by hand, or built using authoring tools. Next, we take a look at the file formats.

The WebVTT format

WEBVTT

00:00:01.878 --> 00:00:05.334
Good day everyone, my name is John Smith

00:00:08.608 --> 00:00:15.296
This video teaches you how to
build a sand castle on any beach.

The file starts with "WEBVTT" on the first line, followed by a blank line. Timing cues use the format "HH:MM:SS.sss". The start and end times are separated by a single space, two hyphens and a greater-than sign (-->), and then another space. The timing cues are on a line by themselves. The caption text follows immediately on the next line and can span one or more lines. A blank line separates each cue from the next. The MIME type that is used on a server is "text/vtt".
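The layout described above can be sketched in a few lines of JavaScript. This is only an illustration, not the code that HTML5 Video Caption Maker uses. The function names (formatTimestamp, buildWebVtt) and the cue shape ({ start, end, text }, with times in seconds) are assumptions made for this example.

```javascript
// Format a time in seconds as "HH:MM:SS.sss" with leading zeros.
function formatTimestamp(totalSeconds) {
  const totalMs = Math.round(totalSeconds * 1000);
  const ms = totalMs % 1000;
  const s = Math.floor(totalMs / 1000) % 60;
  const m = Math.floor(totalMs / 60000) % 60;
  const h = Math.floor(totalMs / 3600000);
  const pad = (n, width) => String(n).padStart(width, "0");
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(ms, 3)}`;
}

// Build WebVTT text from cues shaped like { start, end, text }.
// The cue shape is hypothetical; times are in seconds.
function buildWebVtt(cues) {
  const blocks = cues.map(
    (cue) =>
      `${formatTimestamp(cue.start)} --> ${formatTimestamp(cue.end)}\n${cue.text}`
  );
  // "WEBVTT" header first, then one blank line between every block.
  return ["WEBVTT", ...blocks].join("\n\n") + "\n";
}
```

A multi-line caption is passed as text that contains a line break ("\n"), which ends up as two caption lines under the same timing cue.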

The TTML format

Internet Explorer 10 and Windows Store apps using JavaScript support a subset of the TTML file format, which is defined in the TTML specification. TTML is an XML-based language, and the full specification is quite extensive. The supported subset follows this structure.

The TTML file includes the XML version and encoding type, followed by the root element ("<tt>"), which carries the namespace declaration and the language. The root element contains a "<body>" element and, inside it, a "<div>" element. Within the "<div>" element are the timing cues. The actual times are set as attributes ("begin", "end") of the opening paragraph tag ("<p>"), and the text ends at the closing "</p>" tag. Blank lines and white space are ignored. If a caption has multiple lines, they are separated by "<br/>" tags. The MIME type for TTML files is "application/ttml+xml". See Section 5.2 of the TTML specification for more info.
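As a rough sketch of that structure, the following JavaScript builds a TTML document from a list of cues. It isn't taken from the tool. The function names (escapeXml, buildTtml) and the cue shape ({ begin, end, text }, with times already formatted as "HH:MM:SS.sss") are assumptions for this example. The namespace shown is the standard TTML namespace.

```javascript
// Escape characters that are special in XML text content.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build a TTML document from cues shaped like { begin, end, text }.
// The cue shape is hypothetical; begin and end are "HH:MM:SS.sss" strings.
function buildTtml(cues, lang = "en") {
  const paragraphs = cues.map((cue) => {
    // Line breaks inside a caption become <br/> tags.
    const body = cue.text.split("\n").map(escapeXml).join("<br/>");
    return `<p begin="${cue.begin}" end="${cue.end}">${body}</p>`;
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    `<tt xmlns="http://www.w3.org/ns/ttml" xml:lang="${lang}">`,
    "<body>",
    "<div>",
    ...paragraphs,
    "</div>",
    "</body>",
    "</tt>",
  ].join("\n");
}
```

Escaping "&", "<", and ">" in the caption text keeps the file well formed when a caption contains those characters.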

Using HTML5 Video Caption Maker

In addition to the video file itself, the HTML5 video player needs well-formed track files, which can be challenging to create by hand. The HTML5 Video Caption Maker test drive demo is a simple but effective tool for creating either WebVTT or TTML format files.

The tool lets you view an MP4 video file, set cues and add text captions, and save the file you've created. You can also load files you've already created and edit them. HTML5 Video Caption Maker runs in either Internet Explorer 10 or Windows Internet Explorer 9. However, the track element is currently supported only in Internet Explorer 10.

To get started, run HTML5 Video Caption Maker in your browser. To get a feel for the tool, click the Load Sample buttons to load sample video and timed text files. The video player has the built-in controls enabled, so you can click Play in the player, or use the Play button below the player.

Loading your video file

The easiest way to work with video and text track files is to save them locally. To load your video file into the caption maker, follow these steps:

Type or paste the path to a local .mp4 file into the Enter URL of video file: field.

Click Load. Depending on the size of the video, you'll see the file appear in the video player in a few seconds.

Click the Load Existing Caption File [optional] link. This expands the Enter URL of caption file: field to let you load the sample timed text file, or type your own path. In Internet Explorer 10, you can browse for a file.

Building a timed text file from your video

To create a caption file, play the video and pause to type a caption for the segment you've just viewed. You can then save and go on to the next segment. HTML5 Video Caption Maker automatically records the start and end times.

The timed-text standards don't require start and end times to be contiguous. Cues can have gaps between them, or even overlap, and the video player will still read the file. However, to keep things simple, HTML5 Video Caption Maker creates timed text files in which each segment starts where the previous one ends. If you want a section without captions, leave the caption blank but save the segment.
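To make the idea of contiguous segments concrete, here is a minimal JavaScript sketch. It isn't the tool's own code. The function names (contiguousSegments, findDiscontinuities) are hypothetical, and the sketch assumes that the first segment starts at time 0 and that pause times are given in seconds, in ascending order.

```javascript
// Turn a sorted list of pause times (seconds) into contiguous segments.
// Assumption: the first segment starts at 0; a missing caption is saved as "".
function contiguousSegments(pauseTimes, captions) {
  const segments = [];
  let start = 0;
  pauseTimes.forEach((end, i) => {
    segments.push({ start, end, text: captions[i] || "" });
    start = end; // the next segment begins where this one ended
  });
  return segments;
}

// Report gaps and overlaps between consecutive cues, for example in a
// file whose times were edited by hand.
function findDiscontinuities(cues) {
  const issues = [];
  for (let i = 1; i < cues.length; i++) {
    const delta = cues[i].start - cues[i - 1].end;
    if (delta > 0) issues.push({ index: i, kind: "gap", seconds: delta });
    if (delta < 0) issues.push({ index: i, kind: "overlap", seconds: -delta });
  }
  return issues;
}
```

A segment with an empty caption keeps the timeline contiguous while showing nothing on screen, which matches the advice to save a blank caption for sections without text.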

To start, click Play to begin the video. As the video plays, you'll see a message that updates with the current in and out time points. For example, when you start, you might see the message: "Pause to enter caption for segment from 00:00.000 to 00:05.340:" below the video player.

While the video is playing, the caption field is disabled. When you click Pause, the video stops, the field is enabled, and you can type your caption. This is a multi-line text field, so you can cut and paste, insert, or delete text as you would any form field. You can press "Enter" to insert a line break to create multi-line captions. HTML5 Video Caption Maker automatically formats multiple line captions for the file format you choose.

To save the caption, click Save or Save & Continue. Clicking Save saves the caption, but doesn't restart the video. You'll have to click Play to continue to the next segment. However, you can edit the caption and re-save if you haven't started playing the video again. Save & Continue saves the caption, and restarts the video immediately. This option enables you to quickly work through a video with the fewest clicks.

To edit a saved caption, click the one you want to change in the list in the right pane. The video player immediately replays the section, and then lets you edit the caption. Click Save to update the caption in the caption list. You can also click Save & Continue, which goes on to the next caption so you can work your way through the captions from that point.

While the HTML5 Video Caption Maker lets you edit captions, you can't change the times in the tool currently. However, after you save the file, you can open it in a text editor like Notepad and make changes.

Note: The time format that HTML5 Video Caption Maker uses has leading zeros for hours, minutes, and seconds, and three places to the right of the decimal point for fractional seconds. It's best to keep the same time format if you edit the times in a text editor. While fractional seconds are expressed in thousandths of a second, when you start writing webpages, you'll find that the video player fires ontimeupdate events only about four times a second. See Using HTML5 video events for more info.
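If you read these times from script, a small helper can convert them back to seconds and find the caption for the current playback position. The following is a sketch only. The function names (parseCueTime, findActiveCue) and the cue shape ({ start, end, text }, in seconds) are assumptions for this example. Because the player reports the time only a few times a second, the lookup tests whether the time falls inside a cue's range rather than expecting an exact match.

```javascript
// Parse "HH:MM:SS.sss" into seconds; reject any other format.
function parseCueTime(stamp) {
  const match = /^(\d{2,}):(\d{2}):(\d{2})\.(\d{3})$/.exec(stamp);
  if (!match) throw new Error(`Unexpected time format: ${stamp}`);
  const [, h, m, s, ms] = match.map(Number);
  return h * 3600 + m * 60 + s + ms / 1000;
}

// Return the first cue whose range contains currentTime, or null.
// Cues are shaped like { start, end, text } with times in seconds.
function findActiveCue(cues, currentTime) {
  for (const cue of cues) {
    if (currentTime >= cue.start && currentTime < cue.end) return cue;
  }
  return null;
}

// Browser-only usage (not run here), assuming a "video" element variable:
// video.addEventListener("timeupdate", () => {
//   const cue = findActiveCue(cues, video.currentTime);
//   // show cue.text, or clear the caption when cue is null
// });
```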

Saving your timed text files

When you've finished creating captions for your video, you can save your file as either a WebVTT or a TTML format file. To pick a file format, click the appropriate radio button under Choose caption file format, located below the caption field.

The following screen shot shows selecting a WebVTT format file. The WebVTT format is a relatively simple format that's easy to read:

This screen shows a TTML format file. The TTML file is XML-based, and contains a little more info than the WebVTT format. Internet Explorer 10 can read either format, provided the file is well formed. Notice that below the timed text is a code snippet that shows an example of the video and track tags you can use in your app.
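The snippet that the tool displays isn't reproduced here. As a rough idea of what video and track markup can look like, the following JavaScript sketch builds it as a string. The function names (escapeAttribute, buildVideoMarkup), the default language and label, and the choice of kind="captions" are assumptions for this example, and the file paths you pass in are your own.

```javascript
// Escape a value for use inside a double-quoted HTML attribute.
function escapeAttribute(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
}

// Build video and track markup for one MP4 file and one caption file.
// The lang and label defaults are illustrative, not values from the tool.
function buildVideoMarkup(videoUrl, trackUrl, lang = "en", label = "English") {
  return [
    "<video controls>",
    `  <source src="${escapeAttribute(videoUrl)}" type="video/mp4">`,
    `  <track src="${escapeAttribute(trackUrl)}" kind="captions" ` +
      `srclang="${escapeAttribute(lang)}" label="${escapeAttribute(label)}" default>`,
    "</video>",
  ].join("\n");
}
```

The same markup works for either caption format. Only the track file's path and extension change.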

After you've selected a caption file format, follow these steps:

Click Copy to Clipboard.

Click Allow access in the dialog box that appears to give Internet Explorer access to the clipboard.

Open a text editor like Notepad and paste the contents.

Save using the editor's Save as option so you can specify a file name.

If you chose TTML, add the file extension ".ttml".

If the file is WebVTT, use the extension ".vtt".

In Notepad, change the Encoding option below the file path field to UTF-8.

You should also change the Save as type: to All files (*.*) to avoid adding a ".txt" extension on as well.

Server considerations

If you host your own media content, add the following MIME types to your server's configuration to ensure that timed text files are served correctly: "text/vtt" for WebVTT (.vtt) files and "application/ttml+xml" for TTML (.ttml) files.
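How you register MIME types depends on your server, so the following is only a sketch of the mapping itself, written in JavaScript. The names (TIMED_TEXT_MIME_TYPES, timedTextContentType) are hypothetical. The extensions and MIME types are the ones given earlier in this article.

```javascript
// File extension to MIME type, as described for WebVTT and TTML files.
const TIMED_TEXT_MIME_TYPES = {
  ".vtt": "text/vtt",
  ".ttml": "application/ttml+xml",
};

// Return the MIME type for a timed text file name, or null if the
// extension isn't one of the two caption formats.
function timedTextContentType(fileName) {
  const dot = fileName.lastIndexOf(".");
  const ext = dot === -1 ? "" : fileName.slice(dot).toLowerCase();
  return TIMED_TEXT_MIME_TYPES[ext] || null;
}
```

A script-based server could use the returned value for the Content-Type response header when it serves a caption file.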