Understand the Different Skill Models

The first step in building a new skill is to decide what your skill will do. This determines how your skill integrates with the Alexa service and what you need to build. The Alexa Skills Kit supports building several different types of skills.

Skill Models

Every skill has an interaction model that determines the requests the skill can handle and the words users say to invoke those requests. You can define this yourself with a custom model. The Alexa Skills Kit also provides pre-built models in which the possible requests and utterances are pre-defined for you.

Custom Interaction Model

For the most control over the user's experience, build a skill with a custom interaction model. This is a custom skill.

For a custom skill, you (as the developer) define:

The requests the skill can handle. These are defined as intents. For example, a skill could do any of the following:

Look up tide information

Order a pizza

Request a taxi

Engage the user in a game, such as word puzzles or trivia

Just about any other action you can imagine!

The words users say to make (or invoke) those requests. This is the interaction model, and it provides the voice user interface by which users communicate with the skill. Continuing the above examples:

"Get high tide for Seattle" (this phrase would be mapped to a TideRequest intent).

"Order a large pepperoni pizza" (this phrase would be mapped to an OrderPizza intent).

"Order a car" (this phrase would be mapped to an OrderCar intent).

The visual and touch interactions that users will experience and can invoke. Alexa-enabled devices with a screen support visual displays and touch interactions, so you can create a skill that uses a combination of voice, visual, and touch interactions, or you can opt to have a skill that does not support any screen functionality.

The name Alexa uses to identify your skill, called the invocation name. Users include this when making a request. For example, the skill for looking up tides could be called "tide pooler".
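The pieces above (intents, sample utterances, and the invocation name) come together in the skill's interaction model. The following is a minimal sketch of that model for the hypothetical Tide Pooler skill, written as a Python dict in the shape of the JSON interaction-model schema; the intent name, slot choice, and sample phrases are illustrative, not a complete model.

```python
# A minimal interaction-model sketch for the hypothetical "Tide Pooler"
# custom skill, expressed as a Python dict in the shape of the JSON
# schema used for custom skills. Values are illustrative only.
interaction_model = {
    "interactionModel": {
        "languageModel": {
            "invocationName": "tide pooler",
            "intents": [
                {
                    # Maps utterances like "get high tide for Seattle"
                    # to the TideRequest intent; the city is captured
                    # as a slot.
                    "name": "TideRequest",
                    "slots": [
                        {"name": "City", "type": "AMAZON.City"}
                    ],
                    "samples": [
                        "get high tide for {City}",
                        "when is high tide in {City}",
                    ],
                },
            ],
        }
    }
}

def find_intent(model, name):
    """Return the intent definition with the given name, or None."""
    intents = model["interactionModel"]["languageModel"]["intents"]
    return next((i for i in intents if i["name"] == name), None)

tide = find_intent(interaction_model, "TideRequest")
print(tide["samples"][0])  # -> get high tide for {City}
```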

Putting this all together, a user could say this:

User: Get high tide for Seattle from Tide Pooler

Alexa understands this request and sends the TideRequest intent to the service for the Tide Pooler skill.
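To make the flow concrete, here is a sketch of the kind of IntentRequest body the Alexa service sends to the skill's service for that utterance, and how skill code typically dispatches on the intent name. The shape follows the custom-skill JSON request format, but the identifiers and values are illustrative placeholders.

```python
# Sketch of the IntentRequest the Alexa service would send to the
# Tide Pooler skill's service for "get high tide for Seattle".
# Field shapes follow the custom-skill request format; values are
# illustrative placeholders.
request = {
    "request": {
        "type": "IntentRequest",
        "requestId": "amzn1.echo-api.request.example",
        "intent": {
            "name": "TideRequest",
            "slots": {
                "City": {"name": "City", "value": "Seattle"}
            },
        },
    }
}

def route(req):
    """Dispatch on the intent name, as a custom skill's code typically does."""
    intent = req["request"]["intent"]
    if intent["name"] == "TideRequest":
        return "Looking up tides for " + intent["slots"]["City"]["value"]
    return "Sorry, I didn't understand."

print(route(request))  # -> Looking up tides for Seattle
```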

A custom skill can handle any kind of request, so long as you can create the code to fulfill the request and provide appropriate data in the interaction model to let users invoke the request. This is the most flexible kind of skill you can build, but also the most complex, since you need to provide the interaction model.

Smart Home Skills (Pre-built Model)

For building a skill to control smart home devices such as cameras, lights, locks, thermostats, and smart TVs, you should use the Smart Home pre-built model. This gives you less control over the user's experience, but simplifies development since you don't need to create the voice user interface yourself. These skills are also easier for end users to invoke, since they don't need to remember any invocation name and can make requests such as "Alexa, turn on the living room lights."

For this type of skill, the Smart Home Skill API defines:

The requests the skill can handle. These requests are called device directives. Examples include:

turn on / turn off

increase / decrease the temperature

change the dimness or brightness for a light

lock a door

change the channel on a television

view a live video stream from a smart home camera on Echo Show or Fire TV

The words users say to make (or invoke) those requests. For example:

"turn off the living room lights"

"increase the temperature by two degrees"

"dim the living room lights to 20%"

"lock the back door"

"change channel to PBS"

"show the front door camera"

You (as the developer) define how your skill responds to a particular directive. For instance, you write the code that makes a light turn on when your skill receives a "turn on the light" directive. This code is hosted as an AWS Lambda function. Note that a skill built with the Smart Home Skill API can respond only to the requests (device directives) supported by the API.
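The handler for a directive like "turn on the light" can be sketched as follows. This assumes the v3 Smart Home directive shape (an Alexa.PowerController directive with a header and endpoint); the send_to_device() call is a hypothetical stand-in for whatever device-cloud API actually controls the hardware.

```python
import uuid

def lambda_handler(event, context=None):
    """Sketch of a Smart Home skill's AWS Lambda handler for power
    directives. Assumes the v3 directive shape; send_to_device() is a
    hypothetical placeholder for your device-cloud call."""
    directive = event["directive"]
    name = directive["header"]["name"]            # e.g. "TurnOn" / "TurnOff"
    endpoint_id = directive["endpoint"]["endpointId"]

    if directive["header"]["namespace"] == "Alexa.PowerController":
        power_state = "ON" if name == "TurnOn" else "OFF"
        send_to_device(endpoint_id, power_state)  # hypothetical device call
        return {
            "event": {
                "header": {
                    "namespace": "Alexa",
                    "name": "Response",
                    "payloadVersion": "3",
                    "messageId": str(uuid.uuid4()),
                },
                "endpoint": {"endpointId": endpoint_id},
                "payload": {},
            },
            "context": {
                "properties": [{
                    "namespace": "Alexa.PowerController",
                    "name": "powerState",
                    "value": power_state,
                }]
            },
        }
    raise ValueError("Unsupported directive: " + name)

def send_to_device(endpoint_id, state):
    # Placeholder: forward the new state to your device cloud here.
    pass
```

Note the constraint from the paragraph above in action: the handler can only respond to directives the API defines, so anything outside the supported namespaces is rejected.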

Flash Briefing Skills (Pre-built Model)

A flash briefing skill is the only way to provide content for a customer's flash briefing.

For this type of skill, the Flash Briefing Skill API defines:

The words users say to make (or invoke) those requests. For example:

"give me my flash briefing"

"tell me the news"

You (as the creator) define:

The name, description, and images for a flash briefing skill. This helps a customer choose your skill in the skill store.

One or more content feeds for a flash briefing skill. These feeds can contain audio content that is played to the customer or text content that Alexa reads to the customer.
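For a text feed, each feed entry carries an identifier, an update timestamp, a title, and the text Alexa reads aloud. The sketch below builds one such entry as a Python dict; the field names follow the flash briefing feed format, but the values are illustrative only.

```python
from datetime import datetime, timezone

def make_feed_item(uid, title, text, redirect_url):
    """Build one text-content entry for a flash briefing feed.
    Field names follow the flash briefing feed format; the values
    passed in here are illustrative placeholders."""
    return {
        "uid": uid,
        "updateDate": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.0Z"),
        "titleText": title,
        "mainText": text,               # Alexa reads this aloud for text feeds
        "redirectionUrl": redirect_url,
    }

item = make_feed_item(
    "urn:example:item1",
    "Example headline",
    "This is the text Alexa reads to the customer.",
    "https://example.com/story1",
)
```

An audio feed entry would instead point at a hosted audio stream rather than carrying mainText for Alexa to read.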

Video Skills (Pre-built Model)

A video skill enables you to provide video content such as TV shows and movies for customers.

For this type of skill, the Video Skill API defines:

The words users say to make (or invoke) those requests. For example:

"play Manchester by the Sea"

"change to channel 4"

You (as the creator) define:

The name, description, and images for a video skill. This helps a customer choose your skill in the skill store.

The requests the skill can handle, such as playing and searching for video content, and how video content search results are displayed.

Music Skills (Pre-built Model)

A music skill enables you to provide audio content such as songs, playlists, or radio stations for Alexa users. For this type of skill, the Music Skill API handles the words (utterances) a user can say to request and control audio content, and turns these utterances into requests that are sent to your skill. Your skill code (AWS Lambda function) handles these requests and responds appropriately, sending back audio content for the user to listen to on an Alexa-enabled device.
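The division of labor the paragraph describes can be sketched as a toy dispatcher: the Music Skill API turns an utterance into a structured request, and your Lambda code resolves it to playable audio. The request type, field names, and catalog below are assumptions for illustration, not the exact API shapes.

```python
# Illustrative only: the request type ("GetPlayableContent") and field
# names here are assumptions sketching the flow, not the exact Music
# Skill API payloads. CATALOG stands in for your content backend.
CATALOG = {
    "rainy day jazz": "https://example.com/audio/rainy-day-jazz.mp3",
}

def handle_music_request(request):
    """Resolve a structured music request into playable audio content."""
    if request["type"] == "GetPayableContent".replace("Pay", "Play"):
        url = CATALOG.get(request["query"])
        if url is None:
            return {"type": "NoContentFound"}
        return {"type": "PlayableContent", "streamUrl": url}
    raise ValueError("unsupported request: " + request["type"])
```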

List Skills

A list skill consumes list events from the Alexa service, so the skill can understand and react to changes that users make to their Alexa lists through top-level utterances.
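Reacting to list changes amounts to dispatching on the event type your skill receives. The sketch below uses event names from the AlexaHouseholdListEvent family; the handler bodies are hypothetical placeholders for whatever your skill does with the changed items (e.g. syncing them to your own backend).

```python
def handle_list_event(request):
    """Sketch of dispatching Alexa list events by type. Event names
    follow the AlexaHouseholdListEvent family; the handlers are
    hypothetical placeholders."""
    handlers = {
        "AlexaHouseholdListEvent.ItemsCreated": on_items_created,
        "AlexaHouseholdListEvent.ItemsUpdated": on_items_updated,
        "AlexaHouseholdListEvent.ItemsDeleted": on_items_deleted,
    }
    handler = handlers.get(request["type"])
    if handler is None:
        return None          # event type this sketch doesn't handle
    return handler(request["body"])

def on_items_created(body):
    # Placeholder: e.g. sync the new items to your own backend.
    return ("created", body["listItemIds"])

def on_items_updated(body):
    return ("updated", body["listItemIds"])

def on_items_deleted(body):
    return ("deleted", body["listItemIds"])
```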