AudioPlayer Interface Reference

The AudioPlayer interface provides directives and requests for streaming audio and monitoring playback progression. Your skill can send directives to start and stop the playback. The Alexa service sends your skill AudioPlayer requests to give you information about the playback state, such as when the track is nearly finished, or when playback starts and stops. Alexa also sends PlaybackController requests in response to hardware buttons such as on a remote control or the next/previous touch controls on Alexa-enabled devices with a screen.

In addition to the required built-in intents, your skill should gracefully handle the following additional built-in intents:

AMAZON.CancelIntent

AMAZON.LoopOffIntent

AMAZON.LoopOnIntent

AMAZON.NextIntent

AMAZON.PreviousIntent

AMAZON.RepeatIntent

AMAZON.ShuffleOffIntent

AMAZON.ShuffleOnIntent

AMAZON.StartOverIntent

Note that users can invoke these built-in intents without using your skill's invocation name (see below). If your skill is currently playing audio, or was the skill most recently playing audio, these intents are automatically sent to your skill. Your code needs to expect them and not return an error.

If any of these intents do not apply to your skill, handle them gracefully in your code. For instance, you could return a response with text-to-speech indicating that the command is not relevant to the skill. The specific message depends on the skill and whether the intent is one that might make sense at some point, for example:

For a podcast skill, the AMAZON.ShuffleOnIntent intent might return the message: "I can't shuffle a podcast."

For version 1.0 of a music skill that doesn't yet support playlists and shuffling, the AMAZON.ShuffleOnIntent intent might return: "Sorry, I can't shuffle music yet."
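The graceful handling described above can be sketched as a lookup from intent name to a spoken message. This is a minimal sketch for a hypothetical podcast skill: the `UNSUPPORTED_INTENT_MESSAGES` table, the `handle_unsupported_intent` helper, and the loop messages are assumptions for illustration, not part of the interface.

```python
# Hypothetical messages for built-in intents this example podcast skill
# does not support. The wording is an assumption, not prescribed by Alexa.
UNSUPPORTED_INTENT_MESSAGES = {
    "AMAZON.ShuffleOnIntent": "I can't shuffle a podcast.",
    "AMAZON.ShuffleOffIntent": "I can't shuffle a podcast.",
    "AMAZON.LoopOnIntent": "I can't loop a podcast.",
    "AMAZON.LoopOffIntent": "I can't loop a podcast.",
}


def handle_unsupported_intent(intent_name):
    """Return a speech-only response for an unsupported intent, else None."""
    message = UNSUPPORTED_INTENT_MESSAGES.get(intent_name)
    if message is None:
        return None  # the caller should route this intent to a real handler
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": message},
            "shouldEndSession": True,
        },
    }
```

Returning `None` for intents that are not in the table lets the caller fall through to its regular handlers instead of treating every built-in intent as unsupported.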

Note: If your skill uses the AudioPlayer directives, you cannot extend the above built-in intents with your own sample utterances.

Built-in Intents for Playback Control

When your skill sends a Play directive to begin playback, the Alexa service sends the audio stream to the device for playback. Once the session ends normally (for instance, if your response included the shouldEndSession flag set to true), Alexa remembers that your skill started the playback until the user does one of the following:

Invokes audio playback with a different skill.

Invokes another service that streams audio, such as the built-in music service or the flash briefing.

Reboots the device.

During this time, users can invoke the following built-in playback control intents without using your skill's invocation name:

AMAZON.CancelIntent

AMAZON.LoopOffIntent

AMAZON.LoopOnIntent

AMAZON.NextIntent

AMAZON.PauseIntent

AMAZON.PreviousIntent

AMAZON.RepeatIntent

AMAZON.ResumeIntent

AMAZON.ShuffleOffIntent

AMAZON.ShuffleOnIntent

AMAZON.StartOverIntent

For example, note this scenario for a custom skill called "My Podcast Player". This skill defines an intent PlayLatestEpisode mapped to the sample utterance "play the latest episode."

User: Alexa, ask My Podcast Player to play the latest episode. Alexa opens a new skill session and sends the My Podcast Player skill the normal PlayLatestEpisode intent.

My Podcast Player sends a Play directive. The skill session closes and audio begins playing.

User: Alexa, next. (Note no invocation name used.) Alexa opens a new skill session and sends the My Podcast Player skill AMAZON.NextIntent.

My Podcast Player takes appropriate action for 'next' and closes the skill session.

User: Alexa, pause. (Again, no invocation name.) Alexa opens a new skill session and sends the skill AMAZON.PauseIntent.

My Podcast Player sends a Stop directive and closes the skill session. The audio is stopped.

Although at this point the audio is not playing and there is no current session, the Alexa service is still tracking My Podcast Player as the skill that most recently streamed audio. Assuming the device remains on and the user does not use any other audio streaming skills or services, the following could take place at any time later:

User: Alexa, resume. (Note no invocation name used.) Alexa opens a new skill session and sends My Podcast Player the AMAZON.ResumeIntent.

My Podcast Player takes appropriate action to determine the previously playing track and send a new Play directive to restart playback.

This only applies to the built-in intents. The intents you define (such as the example PlayLatestEpisode intent) must be invoked using a normal invocation phrase.

Note: In the above scenario, when your skill is not in an active session but is playing audio, or was the skill most recently playing audio, utterances such as 'stop' send your skill an AMAZON.PauseIntent instead of an AMAZON.StopIntent.
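The pause and resume steps in the scenario above could be handled along these lines. This is a rough sketch, not the actual My Podcast Player code: the `_last_playback` dictionary stands in for persistent storage, and `remember_position` and `handle_playback_intent` are hypothetical helper names.

```python
# In-memory stand-in for persistent storage (an assumption for this sketch).
# A real skill would need a database, since each request may reach a new process.
_last_playback = {}


def remember_position(user_id, url, token, offset_ms):
    """Save the stream and offset so a later AMAZON.ResumeIntent can restart it."""
    _last_playback[user_id] = {
        "url": url,
        "token": token,
        "offsetInMilliseconds": offset_ms,
    }


def handle_playback_intent(user_id, intent_name):
    """Map AMAZON.PauseIntent and AMAZON.ResumeIntent to AudioPlayer directives."""
    if intent_name == "AMAZON.PauseIntent":
        directives = [{"type": "AudioPlayer.Stop"}]
    elif intent_name == "AMAZON.ResumeIntent":
        saved = _last_playback.get(user_id)
        if saved is None:
            directives = []  # nothing to resume; a real skill might speak instead
        else:
            directives = [{
                "type": "AudioPlayer.Play",
                "playBehavior": "REPLACE_ALL",
                "audioItem": {"stream": dict(saved)},
            }]
    else:
        raise ValueError("unexpected intent: " + intent_name)
    return {
        "version": "1.0",
        "response": {"shouldEndSession": True, "directives": directives},
    }
```

Because no session is active between "pause" and "resume", the offset has to be stored outside the session; here it would be saved when the skill learns that playback stopped.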

AudioPlayer on Alexa-enabled devices with a screen

These sections describe how an audio skill looks and behaves when used on an Alexa-enabled device with a screen.

Note that this does not require you to include the Display interface. Devices with screens handle audio playback automatically.

Custom and Default AudioPlayer Display

AudioPlayer has a visual appearance on Alexa-enabled devices with a screen. You can optionally provide album art, a background image, track title, and subtitle when you send the Play directive. In this case, the devices display the metadata as shown below. If no metadata is provided, the devices display a default player with a plain background and the skill's name.

For details about the metadata you can include, see the audioItem.metadata property in the Play directive.

Note that the AudioPlayer display shows touch controls (next, previous, and pause) when the user touches the device screen. Echo Show and Echo Spot are shown here, and Fire TV Cube is also supported.

Echo Show: AudioPlayer with custom background image, title, subtitle, and album art

Echo Spot: AudioPlayer with custom album art, title, and subtitle (Echo Spot does not use the separate background image)

The pause control automatically stops the playback without sending your skill a request. However, your skill should still handle PlaybackController.PauseCommandIssued, because other devices (such as hardware remotes) do send those requests.

When including a directive in your response, set the type property to the directive you want to send. Include directives in the directives array in your response:

{"version":"1.0","sessionAttributes":{},"response":{"outputSpeech":{},"card":{},"reprompt":{},"shouldEndSession":true,"directives":[{"type":"AudioPlayer.Play","playBehavior":"ENQUEUE","audioItem":{"stream":{"url":"https://url-of-the-mp3-to-play/audiofile.mp3","token":"1234AAAABBBBCCCCCDDDDEEEEEFFFF","expectedPreviousToken":"9876ZZZZZZZYYYYYYYYYXXXXXXXXXXX","offsetInMilliseconds":0},"metadata":{"title":"My opinion: how could you diss-a-brie?","subtitle":"Vince Fontana","art":{"sources":[{"url":"https://url-of-the-skill-image.com/brie-album-art.png"}]},"backgroundImage":{"sources":[{"url":"https://url-of-the-skill-image.com/brie-background.png"}]}}}}]}}

Tip: When responding to a LaunchRequest or IntentRequest, your response can include both AudioPlayer directives and standard response properties such as outputSpeech, card, and reprompt. For example, if you provide outputSpeech in the same response as a Play directive, Alexa speaks the provided text before beginning to stream the audio. Note that the rules are different when responding to AudioPlayer requests.
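As an illustration of combining speech with a directive, the following sketch builds a response to a LaunchRequest or IntentRequest that speaks first and then starts a stream. The helper name `build_play_response` and its arguments are assumptions made for this example.

```python
def build_play_response(speech_text, url, token):
    """Build a response that speaks, then starts a new stream from the beginning."""
    return {
        "version": "1.0",
        "sessionAttributes": {},
        "response": {
            # Alexa speaks this text before streaming begins.
            "outputSpeech": {"type": "PlainText", "text": speech_text},
            "shouldEndSession": True,
            "directives": [
                {
                    "type": "AudioPlayer.Play",
                    "playBehavior": "REPLACE_ALL",
                    "audioItem": {
                        "stream": {
                            "url": url,
                            "token": token,
                            "offsetInMilliseconds": 0,
                        }
                    },
                }
            ],
        },
    }
```

The sketch uses REPLACE_ALL so that no expectedPreviousToken is needed; a response that enqueues a stream would also have to supply that property.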

Note that the Alexa Simulator on the Test page does not render the audio playback, but the Skill I/O section shows the AudioPlayer directives sent from your skill. Since the playback does not occur, you cannot use the Alexa Simulator to test AudioPlayer requests that are triggered by events in the playback, such as PlaybackNearlyFinished.

AudioPlayer Requests

AudioPlayer sends the following requests to notify your skill about changes to the playback state:

Note: When responding to AudioPlayer requests, you can only respond with AudioPlayer directives. The response cannot include any of the standard properties such as outputSpeech. In addition, some requests limit the directives you can use, such as not allowing Play. Sending a response with unsupported properties causes an error. See the request types below for the limits on each request.
Also note that your service is not required to return a response to the AudioPlayer requests.
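Since a response to an AudioPlayer request may not carry the standard properties, a skill might check its own output before returning it. The following is a small sketch of such a check; the function name `find_disallowed_properties` is hypothetical and the check covers only the three standard properties this document names.

```python
# Standard response properties that are not allowed in a response
# to an AudioPlayer request, per the note above.
STANDARD_PROPERTIES = ("outputSpeech", "card", "reprompt")


def find_disallowed_properties(response_body):
    """List standard properties present in a response to an AudioPlayer request."""
    response = response_body.get("response", {})
    return [name for name in STANDARD_PROPERTIES if name in response]
```

An empty list means the response contains none of these properties; it does not check the per-request limits on which directives are allowed.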

Play Directive

Sends Alexa a command to stream the audio file identified by the specified audioItem. Use the playBehavior parameter to determine whether the stream begins playing immediately, or is added to the queue.

Note: You can only send one Play directive in a response.

When sending a Play directive, you normally set the shouldEndSession flag in the response object to true to end the session. If you set this flag to false, Alexa sends the stream to the device for playback, then immediately pauses the stream to listen for the user's response.

{"type":"AudioPlayer.Play","playBehavior":"valid playBehavior value such as ENQUEUE","audioItem":{"stream":{"url":"https://url-of-the-stream-to-play","token":"opaque token representing this stream","expectedPreviousToken":"opaque token representing the previous stream","offsetInMilliseconds":0},"metadata":{"title":"title of the track to display","subtitle":"subtitle of the track to display","art":{"sources":[{"url":"https://url-of-the-album-art-image.png"}]},"backgroundImage":{"sources":[{"url":"https://url-of-the-background-image.png"}]}}}}

Parameters

type (string, required): Set to AudioPlayer.Play.

playBehavior (string, required): Describes playback behavior. Accepted values:

REPLACE_ALL: Immediately begin playback of the specified stream, and replace current and enqueued streams.

ENQUEUE: Add the specified stream to the end of the current queue. This does not impact the currently playing stream.

REPLACE_ENQUEUED: Replace all streams in the queue. This does not impact the currently playing stream.

audioItem (object, required): Contains an object providing information about the audio stream to play.

audioItem.stream (object, required): Contains an object representing the audio stream to play.

audioItem.stream.url (string, required): Identifies the location of audio content at a remote HTTPS location.

The audio file must be hosted at an Internet-accessible HTTPS endpoint. HTTPS is required, and the domain hosting the files must present a valid, trusted SSL certificate. Self-signed certificates cannot be used. Many content hosting services provide this. For example, you could host your files at a service such as Amazon Simple Storage Service (Amazon S3) (an Amazon Web Services offering).

The supported formats for the audio file include AAC/MP4, MP3, and HLS. Bitrates: 16 kbps to 384 kbps.

audioItem.stream.expectedPreviousToken (string, required when playBehavior is ENQUEUE): This property is required and allowed only when the playBehavior is ENQUEUE. This is used to prevent potential race conditions if requests to progress through a playlist and change tracks occur at the same time. For details, see Playlist Progression with ENQUEUE.

audioItem.stream.offsetInMilliseconds: The timestamp in the stream from which Alexa should begin playback. Set to 0 to start playing the stream from the beginning. Set to any other value to start playback from that associated point in the stream.

Playlist Progression with ENQUEUE

The audioItem.stream.expectedPreviousToken property is required if playBehavior is ENQUEUE to handle situations in which requests to progress through a playlist and change tracks happen at the same time. For example:

1. The skill is streaming track 2 in a playlist of several tracks.

2. The user says "Alexa, go back," which sends an AMAZON.PreviousIntent.

3. At about the same time, track 2 is nearly finished, so Alexa sends a PlaybackNearlyFinished request.

4. The skill handles the AMAZON.PreviousIntent first and sends a new Play directive with track 1. This track begins playing. The already-sent PlaybackNearlyFinished request is now outdated, since it assumed that track 2 was playing.

5. The skill handles the now-outdated PlaybackNearlyFinished request and sends a Play directive with track 3, since this is the next track after the originally playing track 2. This request includes expectedPreviousToken set to track 2.

6. The expectedPreviousToken provided in the directive does not match the token for the actively playing stream, so the device ignores this directive.

7. As track 1 finishes, Alexa sends a PlaybackNearlyFinished request. The skill responds with a Play directive for track 2. This track begins playing once track 1 finishes.

If this check was not in place, the directive sent in step 5 would put track 3 on the queue, which would cause the audio to skip from track 1 to track 3 when track 1 finishes.
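The sequence above can be replayed with a toy model of the device-side check. This is only an illustration of the rule as described here, not Alexa's actual implementation; the `DeviceQueue` class and the track tokens are invented for the example.

```python
class DeviceQueue:
    """Toy model of the expectedPreviousToken check (an assumption, not Alexa code)."""

    def __init__(self, playing_token):
        self.playing = playing_token  # token of the actively playing stream
        self.queue = []               # tokens waiting to play next

    def apply_play(self, directive):
        """Apply a Play directive; return False if the device ignores it."""
        stream = directive["audioItem"]["stream"]
        behavior = directive["playBehavior"]
        if behavior == "REPLACE_ALL":
            self.playing = stream["token"]
            self.queue = []
            return True
        if behavior == "ENQUEUE":
            # Ignore the directive when it was built for a stream
            # that is no longer the one actively playing.
            if stream["expectedPreviousToken"] != self.playing:
                return False
            self.queue.append(stream["token"])
            return True
        raise ValueError("behavior not modeled in this sketch: " + behavior)


def play(behavior, token, expected_previous=None):
    """Build a minimal Play directive for the simulation."""
    stream = {"token": token}
    if expected_previous is not None:
        stream["expectedPreviousToken"] = expected_previous
    return {"playBehavior": behavior, "audioItem": {"stream": stream}}
```

Starting from `DeviceQueue("track-2")`, applying `play("REPLACE_ALL", "track-1")` and then the outdated `play("ENQUEUE", "track-3", "track-2")` leaves the queue empty, because the second directive is ignored. A later `play("ENQUEUE", "track-2", "track-1")` is accepted.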

Note: Including audioItem.stream.expectedPreviousToken when playBehavior is any other value (REPLACE_ALL or REPLACE_ENQUEUED) causes an error.
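One way to avoid sending an invalid combination is to validate the arguments when building the directive. The sketch below enforces the rules stated in this section; `build_play_directive` and its parameter names are assumptions for illustration.

```python
VALID_PLAY_BEHAVIORS = {"REPLACE_ALL", "ENQUEUE", "REPLACE_ENQUEUED"}


def build_play_directive(play_behavior, url, token, offset_ms=0,
                         expected_previous_token=None):
    """Build an AudioPlayer.Play directive, rejecting invalid combinations."""
    if play_behavior not in VALID_PLAY_BEHAVIORS:
        raise ValueError("unknown playBehavior: " + play_behavior)
    if not url.startswith("https://"):
        raise ValueError("the stream URL must use HTTPS")
    if play_behavior == "ENQUEUE" and expected_previous_token is None:
        raise ValueError("expectedPreviousToken is required with ENQUEUE")
    if play_behavior != "ENQUEUE" and expected_previous_token is not None:
        raise ValueError("expectedPreviousToken is only allowed with ENQUEUE")

    stream = {"url": url, "token": token, "offsetInMilliseconds": offset_ms}
    if expected_previous_token is not None:
        stream["expectedPreviousToken"] = expected_previous_token
    return {
        "type": "AudioPlayer.Play",
        "playBehavior": play_behavior,
        "audioItem": {"stream": stream},
    }
```

Raising an error in the skill's own code surfaces the mistake during development rather than as an error reported by Alexa after the response is sent.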

Guidelines for images for Alexa-enabled devices with a screen

If you provide images in the audioItem.metadata.art and audioItem.metadata.backgroundImage properties, note the following guidelines:

For the audioItem.metadata.art, use a square image for the best results. If the image is not square, it is displayed with extra black space on the device. Note that the image is cropped to a circle shape on the Echo Spot.

Provide the minimum recommended size as noted below to ensure that the image is never scaled up. If you provide a smaller image, the device must scale it up, which can make the image appear blurry.

The Image object lets you provide multiple image URLs in the sources array. As with the Display Interface, the device selects the image with the highest resolution to display.

The following properties for a particular image source on the Image object are not used when displaying the background image and album art for audio and can be left out of the object:

contentDescription

size

widthPixels

heightPixels

Important: The metadata for a given audio stream is identified by the audioItem.stream.token included in the Play directive. Note that the metadata associated with a particular audioItem.stream.token may be cached in the Alexa service for up to five days, so changes to the metadata (such as a different image, or a change to the title text) may not be reflected immediately on the device. For instance, you may notice this when testing if you experiment with different images or title text for the same audio stream. You can send a new Play directive with a different audioItem.stream.token to clear the cache.
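One possible way to get a different token whenever the metadata changes is to append a short fingerprint of the metadata to a base token. This is a sketch of that idea under stated assumptions: the `versioned_token` helper and the `base:fingerprint` token layout are inventions for this example, not an Alexa convention.

```python
import hashlib
import json


def versioned_token(base_token, metadata):
    """Return a token that changes whenever the metadata content changes."""
    # Serialize with sorted keys so equal metadata always hashes the same way.
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]
    return "{}:{}".format(base_token, digest)
```

With this approach the skill has to use the same versioned value everywhere the token appears, including expectedPreviousToken, and must strip the fingerprint if it needs the base token back.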

The following table notes the recommended minimum size for images used with Alexa-enabled devices with a screen.

Art image (audioItem.metadata.art)

Recommended minimum size: 480 x 480 pixels

Echo Show/Fire TV Cube: Scaled to 300 x 300 and displayed as album art.

Echo Spot: Scaled to 480 x 480, cropped to a circle, and displayed as the background image with 70% opacity black scrim.

Background image (audioItem.metadata.backgroundImage)

Recommended minimum size: 1024 x 640 pixels

Echo Show/Fire TV Cube: Scaled to 1024 x 640 and displayed as a background image. Your image is displayed as is on the Echo Show or Fire TV Cube, so apply any fading effects in your source image if needed. For instance, you could apply a 70% opacity black layer over your image to give it a faded appearance and make the text stand out.

Echo Spot: Not used.
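The recommended minimums above could be checked before image URLs are added to the metadata. The following sketch assumes the skill already knows each image's pixel dimensions; `check_image_size` and the warning texts are hypothetical.

```python
# Recommended minimum sizes in pixels, taken from the table above.
RECOMMENDED_MIN_SIZE = {
    "art": (480, 480),
    "backgroundImage": (1024, 640),
}


def check_image_size(kind, width, height):
    """Return a list of warnings for an image of the given kind and size."""
    min_w, min_h = RECOMMENDED_MIN_SIZE[kind]
    warnings = []
    if width < min_w or height < min_h:
        warnings.append(
            "{} is smaller than {} x {} and may be scaled up".format(kind, min_w, min_h)
        )
    if kind == "art" and width != height:
        warnings.append("art is not square and may be shown with extra black space")
    return warnings
```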

Stop Directive

Stops the current audio playback.

{"type":"AudioPlayer.Stop"}

type (string, required): Set to AudioPlayer.Stop.

ClearQueue Directive

Clears the audio playback queue. You can set this directive to clear the queue without stopping the currently playing stream, or clear the queue and stop any currently playing stream.

{"type":"AudioPlayer.ClearQueue","clearBehavior":"valid clearBehavior value such as CLEAR_ALL"}

type (string, required): Set to AudioPlayer.ClearQueue.

clearBehavior (string, required): Describes the clear queue behavior. Accepted values:

CLEAR_ENQUEUED: Clears the queue and continues to play the currently playing stream.

CLEAR_ALL: Clears the entire playback queue and stops the currently playing stream (if applicable).
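A small helper can make the choice between the two clearBehavior values explicit at the call site. This is a sketch only; `build_clear_queue_directive` and its `stop_current` flag are names invented for the example.

```python
def build_clear_queue_directive(stop_current):
    """Build an AudioPlayer.ClearQueue directive.

    stop_current=True clears the queue and stops the current stream (CLEAR_ALL);
    stop_current=False clears only the enqueued streams (CLEAR_ENQUEUED).
    """
    clear_behavior = "CLEAR_ALL" if stop_current else "CLEAR_ENQUEUED"
    return {"type": "AudioPlayer.ClearQueue", "clearBehavior": clear_behavior}
```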

PlaybackStarted Request

Sent when Alexa begins playing the audio stream previously sent in a Play directive. This lets your skill verify that playback began successfully.

This request is also sent when Alexa resumes playback after pausing it for a voice request.

{"type":"AudioPlayer.PlaybackStarted","requestId":"unique.id.for.the.request","timestamp":"timestamp of request in format: 2018-04-11T15:15:25Z","token":"token representing the currently playing stream","offsetInMilliseconds":0,"locale":"a locale code such as en-US"}

PlaybackFinished Request

Sent when the audio stream Alexa is playing comes to an end on its own.

{"type":"AudioPlayer.PlaybackFinished","requestId":"unique.id.for.the.request","timestamp":"timestamp of request in format: 2018-04-11T15:15:25Z","token":"token representing the currently playing stream","offsetInMilliseconds":0,"locale":"a locale code such as en-US"}

PlaybackStopped Request

Sent when Alexa stops playing an audio stream in response to one of the following AudioPlayer directives:

Stop

Play with a playBehavior of REPLACE_ALL.

ClearQueue with a clearBehavior of CLEAR_ALL.

This request is also sent if the user makes a voice request to Alexa, since this temporarily pauses the playback. In this case, the playback resumes automatically once the voice interaction is complete.

Note: If playback stops because the audio stream comes to an end on its own, Alexa sends PlaybackFinished instead of PlaybackStopped.

{"type":"AudioPlayer.PlaybackStopped","requestId":"unique.id.for.the.request","timestamp":"timestamp of request in format: 2018-04-11T15:15:25Z","token":"token representing the currently playing stream","offsetInMilliseconds":0,"locale":"a locale code such as en-US"}

Valid Response Types

Your skill cannot return a response to PlaybackStopped.
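Although no response is returned, PlaybackStopped is a natural point to record where playback stopped so that it can be resumed later. A minimal sketch, assuming a plain dictionary `positions` as a stand-in for persistent storage and a hypothetical handler name:

```python
def handle_playback_stopped(request, positions):
    """Record the offset at which a stream stopped, keyed by its token.

    `positions` is a dict standing in for persistent storage (an assumption).
    """
    positions[request["token"]] = request["offsetInMilliseconds"]
    return None  # no response is returned for PlaybackStopped
```

The token and offsetInMilliseconds fields are the ones shown in the request above; how the stored value is tied to a particular user is left out of this sketch.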

PlaybackNearlyFinished Request

Sent when the device is ready to add the next stream to the queue.

To progress through a playlist of audio streams, respond to this request with a Play directive for the next stream and set playBehavior to ENQUEUE or REPLACE_ENQUEUED. This adds the new stream to the queue without stopping the current playback. Alexa begins streaming the new audio item once the currently playing track finishes.

{"type":"AudioPlayer.PlaybackNearlyFinished","requestId":"unique.id.for.the.request","timestamp":"timestamp of request in format: 2018-04-11T15:15:25Z","token":"token representing the currently playing stream","offsetInMilliseconds":0,"locale":"a locale code such as en-US"}
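Playlist progression in response to this request might look like the following sketch. The `PLAYLIST` contents, the example.com URLs, and the handler name are assumptions; the directive shape follows the Play directive described in this document.

```python
# Hypothetical playlist; tokens and URLs are placeholders for this sketch.
PLAYLIST = [
    {"token": "track-1", "url": "https://example.com/audio/track-1.mp3"},
    {"token": "track-2", "url": "https://example.com/audio/track-2.mp3"},
    {"token": "track-3", "url": "https://example.com/audio/track-3.mp3"},
]


def handle_playback_nearly_finished(request, playlist):
    """Enqueue the track after the current one, or return None if there is none."""
    current = request["token"]
    tokens = [track["token"] for track in playlist]
    if current not in tokens:
        return None
    next_index = tokens.index(current) + 1
    if next_index >= len(playlist):
        return None  # end of the playlist; nothing to enqueue
    next_track = playlist[next_index]
    directive = {
        "type": "AudioPlayer.Play",
        "playBehavior": "ENQUEUE",
        "audioItem": {
            "stream": {
                "url": next_track["url"],
                "token": next_track["token"],
                # Must match the stream that triggered this request.
                "expectedPreviousToken": current,
                "offsetInMilliseconds": 0,
            }
        },
    }
    return {"version": "1.0", "response": {"directives": [directive]}}
```

Setting expectedPreviousToken to the token from the request means an outdated request produces a directive that the device can safely ignore.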

PlaybackFailed Request

Sent when Alexa encounters an error when attempting to play a stream.

Syntax

{"type":"AudioPlayer.PlaybackFailed","requestId":"unique.id.for.the.request","timestamp":"timestamp of request in format: 2018-04-11T15:15:25Z","token":"token representing the currently playing stream","offsetInMilliseconds":0,"locale":"a locale code such as en-US","error":{"type":"error code","message":"description of the error that occurred"},"currentPlaybackState":{"token":"token representing stream playing when error occurred","offsetInMilliseconds":0,"playerActivity":"player state when error occurred, such as PLAYING"}}

This request type includes two token properties – one as a property of the request object, and one as a property of the currentPlaybackState object. The request.token property represents the stream that failed to play. The currentPlaybackState.token property can be different if Alexa is playing a stream and the error occurs when attempting to buffer the next stream on the queue. In this case, currentPlaybackState.token represents the stream that was successfully playing.
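The distinction between the two tokens can be made explicit when inspecting a failure. The sketch below only compares the tokens from the request shown above; `describe_playback_failure` and the keys of its result are hypothetical.

```python
def describe_playback_failure(request):
    """Summarize which stream failed and which stream, if any, was playing."""
    failed_token = request["token"]
    state = request.get("currentPlaybackState", {})
    playing_token = state.get("token")
    return {
        "failed_token": failed_token,
        "playing_token": playing_token,
        # True when a different stream was playing, which suggests the error
        # occurred while buffering the next stream on the queue.
        "failed_while_buffering_next": (
            playing_token is not None and playing_token != failed_token
        ),
    }
```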

System.ExceptionEncountered Request

If a response to an AudioPlayer request causes an error, your skill is sent a System.ExceptionEncountered request. Any directives included in the response are ignored.

{"type":"System.ExceptionEncountered","requestId":"unique.id.for.the.request","timestamp":"timestamp of request in format: 2018-04-11T15:15:25Z","locale":"a locale code such as en-US","error":{"type":"error code such as INVALID_RESPONSE","message":"description of the error that occurred"},"cause":{"requestId":"unique identifier for the request that caused the error"}}
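Because the directives in the failing response are ignored, logging this request is often the only way to notice the problem. A minimal sketch using Python's standard logging module; the logger name and the summary format are assumptions.

```python
import logging

logger = logging.getLogger("audio_skill")  # hypothetical logger name


def handle_exception_encountered(request):
    """Log a System.ExceptionEncountered request and return the logged summary."""
    error = request.get("error", {})
    cause = request.get("cause", {})
    summary = "{}: {} (caused by request {})".format(
        error.get("type", "UNKNOWN"),
        error.get("message", ""),
        cause.get("requestId", "unknown"),
    )
    logger.error(summary)
    return summary
```

The cause.requestId value can be matched against the skill's own request logs to find the response that triggered the error.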