The TemplateRuntime interface provides visual metadata to AVS-enabled products with GUI support. These display cards are used to describe or enhance a user's voice interactions. Metadata is provided as structured JSON and should be bound to templates that adhere to design guidelines for each supported device type.

Data Flow

This diagram illustrates the high-level message flow for delivering visual metadata to an AVS-enabled product.

1. A user asks, "Who is Usain Bolt?" Their speech is captured by your product and streamed to AVS.

2. AVS returns two directives:

   a. A Speak directive that instructs your client to play Alexa TTS.

   b. A RenderTemplate directive that instructs your client to display visual metadata, in this case information about Usain Bolt.

3. Playback of Alexa TTS starts.

4. The RenderTemplate directive is rendered immediately (and if possible, in tandem with the Speak directive) in a separate thread.

5. Your client informs AVS that your product has started playback of Alexa TTS by sending a SpeechStarted event.

6. When playback of Alexa TTS finishes, your client sends a SpeechFinished event to AVS.

Rendering Visual Metadata

In addition to the guidance provided in the Interaction Model, these rules must be enforced by your product when rendering visual metadata:

Read the response on the request thread and parse the directives:

Immediately execute directives without a dialogRequestId on a new thread.

Immediately execute RenderTemplate directives on a new thread.

Place directives with a dialogRequestId in your queue.

Directives in the queue should be picked up on a separate thread and handled sequentially.
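
The routing rules above can be sketched as follows. This is a minimal illustration, not a reference implementation: the directive is assumed to be a parsed dict with a header containing name and, optionally, dialogRequestId, and DirectiveDispatcher, needs_immediate_handling, and the handler callback are hypothetical names.

```python
import queue
import threading


def needs_immediate_handling(directive: dict) -> bool:
    """Return True if the directive should bypass the queue.

    Assumes a parsed directive shaped like {"header": {"name": ..., "dialogRequestId": ...}}.
    """
    header = directive["header"]
    if header.get("name") == "RenderTemplate":
        return True  # RenderTemplate is always executed immediately
    return "dialogRequestId" not in header


class DirectiveDispatcher:
    """Hypothetical dispatcher: immediate directives get their own thread,
    everything else is handled sequentially by a single worker thread."""

    def __init__(self, handler):
        self._handler = handler
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def dispatch(self, directive: dict):
        """Route one directive; returns the new thread for immediate ones, else None."""
        if needs_immediate_handling(directive):
            thread = threading.Thread(
                target=self._handler, args=(directive,), daemon=True
            )
            thread.start()
            return thread
        self._queue.put(directive)
        return None

    def _drain(self):
        # Single consumer, so queued directives are handled in arrival order.
        while True:
            directive = self._queue.get()
            try:
                self._handler(directive)
            finally:
                self._queue.task_done()

    def wait_for_queue(self):
        """Block until every queued directive has been handled."""
        self._queue.join()
```

A single worker thread is used for the queue so that ordering is preserved without extra locking; a real client would also need error handling and a way to clear the queue.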

Play directives and their associated RenderPlayerInfo directives must be kept in sync. Unlike RenderTemplate, a RenderPlayerInfo directive should not always be rendered immediately; instead, it should follow the sequence of Play directives. For example, if you receive a new Play directive and RenderPlayerInfo directive after sending PlaybackNearlyFinished, both must be added to the queue and handled when the currently playing track has finished. This means that your display card implementation must be aware of playback state, such as playing, stopped, or paused.
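
One way to keep Now Playing cards aligned with playback is to queue each Play directive together with its RenderPlayerInfo payload and render the card only when that track actually starts. The sketch below assumes the client has already paired the two directives; NowPlayingCoordinator, its method names, and the render callback are hypothetical.

```python
from collections import deque


class NowPlayingCoordinator:
    """Hypothetical helper that renders player info in step with playback."""

    def __init__(self, render):
        self._render = render      # callback that draws the Now Playing card
        self._pending = deque()    # (play, player_info) pairs awaiting playback
        self.state = "stopped"     # one of: "playing", "paused", "stopped"
        self.current_play = None

    def enqueue(self, play, player_info):
        """Add a Play directive and its RenderPlayerInfo payload to the queue."""
        self._pending.append((play, player_info))
        if self.state == "stopped":
            self._start_next()

    def pause(self):
        if self.state == "playing":
            self.state = "paused"

    def resume(self):
        if self.state == "paused":
            self.state = "playing"

    def on_track_finished(self):
        """Call when the current track ends; starts the next queued track, if any."""
        self.state = "stopped"
        self.current_play = None
        self._start_next()

    def _start_next(self):
        if not self._pending:
            return
        play, player_info = self._pending.popleft()
        self.current_play = play
        self.state = "playing"
        self._render(player_info)  # the card changes only when the track changes
```

With this shape, a pair that arrives while a track is playing or paused stays queued, and its card appears only after on_track_finished is called for the current track.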

Capabilities API

To use version 1.0 of the TemplateRuntime interface, declare it in your call to the Capabilities API. For additional details, see Capabilities API.
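
As a rough illustration, the declaration might be built like this. The field names (envelopeVersion, capabilities, type, interface, version) and the envelope version value are assumptions about the Capabilities API request body and are not taken from this page; check them against the Capabilities API reference before use. The function name is hypothetical.

```python
import json


def build_capabilities_body() -> dict:
    """Build a request body declaring TemplateRuntime 1.0 (assumed schema)."""
    return {
        "envelopeVersion": "20160207",  # assumed value, verify in the Capabilities API docs
        "capabilities": [
            {
                "type": "AlexaInterface",        # assumed field and value
                "interface": "TemplateRuntime",
                "version": "1.0",
            },
            # ...declare the other interfaces your product supports here
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_capabilities_body(), indent=2))
```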

RenderTemplate Directive

The RenderTemplate directive instructs your client to display visual metadata associated with a user's request. For example, when a user asks Alexa, "Who is Usain Bolt?", AVS sends a Speak directive and also a RenderTemplate directive with visual metadata that your client binds to a template and renders for the end user.

dialogRequestId (string): A unique ID used to correlate directives sent in response to a specific Recognize event.

Payload Parameters

| Parameter | Description | Type |
|---|---|---|
| token | An opaque token provided by Alexa. | string |
| type | Identifies the template. In this example, type is set to BodyTemplate1. | string |
| title | Contains key/value pairs for title information, such as title and subtitle. Actual key/value pairs vary by template. | object |
| title.mainTitle | The title. | string |
| title.subTitle | The subtitle. | string |
| skillIcon | The icon/logo for the skill delivering metadata. This is an optional parameter for the content provider and may not be included in the JSON (or may have a null value). The image structure contains information such as url, size, widthPixels and heightPixels. For more information, see image structure below. | |
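
To show how these payload fields might be bound to a template, here is a minimal sketch. It assumes the payload has already been parsed into a dict and that skillIcon, when present, follows the image structure with a sources list; bind_render_template and the keys of the returned view model are hypothetical.

```python
from typing import Any, Dict, Optional


def bind_render_template(payload: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Map a RenderTemplate payload to a flat view model (hypothetical shape)."""
    title = payload.get("title") or {}
    # skillIcon is optional: it may be absent or explicitly null.
    skill_icon = payload.get("skillIcon") or {}
    sources = skill_icon.get("sources") or []
    icon_url = sources[0].get("url") if sources else None
    return {
        "template": payload["type"],          # e.g. "BodyTemplate1"
        "token": payload["token"],
        "main_title": title.get("mainTitle"),
        "sub_title": title.get("subTitle"),
        "skill_icon_url": icon_url,
    }
```

Since the key/value pairs under title vary by template, the lookups use get so that a missing key yields None instead of raising an error.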

RenderPlayerInfo Directive

The RenderPlayerInfo directive instructs your client to display visual metadata associated with a media item, such as a song or playlist. In addition to sending a Play directive, AVS will send a RenderPlayerInfo directive with visual metadata specific to an audio content provider that your client will bind to a template and render for the end user.

Important: Now Playing visual metadata must be rendered to the specification provided in our UX Design Overview.

| Parameter | Description | Type |
|---|---|---|
| controls.name | The name of the control. All controls included in the array must be rendered. Accepted values: PLAY_PAUSE, NEXT, PREVIOUS, SKIP_FORWARD, SKIP_BACKWARD, SHUFFLE, LOOP, THUMBS_UP, THUMBS_DOWN. | string |
| controls.enabled | Informs the client if the control is clickable. The value is true when the control can be clicked by the user. | boolean |
| controls.selected | Indicates that a control should render in a selected state. For example, if a user has favorited a song, when this song plays, the control that represents favorite will have a selected value of true. | boolean |
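
A client could turn the controls array into render state along these lines. This is a sketch under assumptions: each control is a dict with name, enabled, and selected keys, missing booleans default to False, and unknown names raise an error. control_view_state and the returned keys are hypothetical.

```python
ACCEPTED_CONTROLS = {
    "PLAY_PAUSE", "NEXT", "PREVIOUS", "SKIP_FORWARD", "SKIP_BACKWARD",
    "SHUFFLE", "LOOP", "THUMBS_UP", "THUMBS_DOWN",
}


def control_view_state(controls: list) -> list:
    """Return one render descriptor per control, preserving array order."""
    states = []
    for control in controls:
        name = control["name"]
        if name not in ACCEPTED_CONTROLS:
            # Design choice for this sketch: fail loudly on unknown names.
            raise ValueError(f"unsupported control: {name}")
        states.append({
            "name": name,
            "clickable": bool(control.get("enabled", False)),
            "selected": bool(control.get("selected", False)),
        })
    return states
```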

Control to Event Mapping

When a user interacts with an on-screen control an event must be sent to Alexa using the PlaybackController interface. This table maps controls to the events in the PlaybackController interface that must be sent:

| Control Type | Control Name | Event | Notes |
|---|---|---|---|
| BUTTON | PLAY_PAUSE | PlayCommandIssued | n/a |
| BUTTON | NEXT | NextCommandIssued | n/a |
| BUTTON | PREVIOUS | PreviousCommandIssued | n/a |
| BUTTON | SKIP_FORWARD | ButtonCommandIssued | The control is specified in the event payload. |
| BUTTON | SKIP_BACKWARD | ButtonCommandIssued | The control is specified in the event payload. |
| TOGGLE | SHUFFLE | ToggleCommandIssued | The control is specified in the event payload. |
| TOGGLE | LOOP | ToggleCommandIssued | The control is specified in the event payload. |
| TOGGLE | THUMBS_UP | ToggleCommandIssued | The control is specified in the event payload. |
| TOGGLE | THUMBS_DOWN | ToggleCommandIssued | The control is specified in the event payload. |
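
The mapping above lends itself to a simple lookup table. The sketch below only resolves which PlaybackController event to send and whether the control must be named in its payload; the returned dict is a hypothetical internal description, not the wire format of the event, which is defined by the PlaybackController interface.

```python
# Control name -> PlaybackController event name, as listed in the table above.
CONTROL_EVENTS = {
    "PLAY_PAUSE": "PlayCommandIssued",
    "NEXT": "NextCommandIssued",
    "PREVIOUS": "PreviousCommandIssued",
    "SKIP_FORWARD": "ButtonCommandIssued",
    "SKIP_BACKWARD": "ButtonCommandIssued",
    "SHUFFLE": "ToggleCommandIssued",
    "LOOP": "ToggleCommandIssued",
    "THUMBS_UP": "ToggleCommandIssued",
    "THUMBS_DOWN": "ToggleCommandIssued",
}

# Events whose payload must identify the control that was used.
PAYLOAD_NAMES_CONTROL = {"ButtonCommandIssued", "ToggleCommandIssued"}


def event_for_control(control_name: str) -> dict:
    """Describe the event to send when the user activates an on-screen control."""
    event_name = CONTROL_EVENTS[control_name]  # KeyError for unknown controls
    return {
        "namespace": "PlaybackController",
        "name": event_name,
        # None means the event name alone identifies the action.
        "control": control_name if event_name in PAYLOAD_NAMES_CONTROL else None,
    }
```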

Image Structure

Important: Content providers are only required to send the image source.url. All other fields are optional for content providers; however, your client should be prepared to handle these fields if present. Note: sources is a list of images, and this list may contain a single source or multiple sources for the same image. There is no guarantee that multiple sources will be provided.
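
Because only url is guaranteed, source selection has to degrade gracefully. The following sketch picks a source for a given display width: it prefers the smallest source that is at least as wide as the target, falls back to the widest source with a known width, and otherwise uses the first source. The selection policy, pick_source, and target_width are assumptions made for illustration and are not part of the interface.

```python
from typing import Any, Dict, Optional


def pick_source(image: Dict[str, Any], target_width: int) -> Optional[Dict[str, Any]]:
    """Choose one entry from image["sources"], tolerating missing optional fields."""
    sources = [s for s in (image.get("sources") or []) if s.get("url")]
    if not sources:
        return None
    # widthPixels is optional, so only some sources may be comparable by width.
    sized = [s for s in sources if s.get("widthPixels") is not None]
    if not sized:
        return sources[0]
    big_enough = [s for s in sized if s["widthPixels"] >= target_width]
    if big_enough:
        return min(big_enough, key=lambda s: s["widthPixels"])
    return max(sized, key=lambda s: s["widthPixels"])
```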

The image size as an enumerated value. This is an optional parameter for the content provider and may not be included in the JSON. If widthPixels and/or heightPixels are not provided, render to the specification provided below. Accepted values: X-SMALL, SMALL, MEDIUM, LARGE, and X-LARGE.

string

sources[i].widthPixels

Image width in pixels. This is an optional parameter for the content provider and may not be included in the JSON.

long

sources[i].heightPixels

Image height in pixels. This is an optional parameter for the content provider and may not be included in the JSON.