Alexa, Amazon’s voice assistant, allows developers to build their own interactions (called skills) in a powerful, if sometimes complex, platform. While this post is not meant to be an introduction to building skills, I’ll introduce a few of the basics and then focus on the specifics of building skills with the Webtask platform.

Getting Started

As I mentioned above, this post is not meant to walk you through the learning experience of building for Alexa. Amazon has some great tutorials for that and I recommend you spend some time going through their docs: Alexa Skills Kit - Tutorials. It may be useful though to cover what it’s like, at a high level, to build for the platform.

First off - Alexa development is really split into two aspects. The first part, and actually the most difficult, is defining your interactions at a meta level; the second is writing the code that handles them. What do I mean by that? Think about your interactions with someone at a coffee shop. In general, they probably break down to a few things:

  • What do you sell?
  • Can I get a X? (Where X is some menu item.)
  • What are your hours? (Ok this is probably not something you would ask if you were already in the store itself.)

Alexa defines these as “Intents”. You may define the above interactions as intents with these names:

  • Menu
  • OrderItem
  • Hours

Each intent will have multiple sample utterances. These are basically example phrasings of an intent. So for example, the Menu intent may have these utterances:

  • What’s on your menu?
  • What do you sell?
  • What can I buy here?
  • Oh my God I need coffee do you sell coffee?

You aren’t required to list every single possible known utterance. Alexa will use what you provide as an example and is intelligent enough to figure out what you mean. So for example - Alexa would probably match “What exactly do you sell here?” even though it isn’t listed. But the more you provide, the better Alexa can do.

Finally, Alexa has a concept of “slots”, which are basically variables. So the OrderItem intent may have sample utterances that look like so:

  • I’d like to order {item}.
  • Give me {item}!

You define what is meant by “item” by providing a simple list: coffee, tea, milk, beer. But Alexa also supports a large number of built-in slots. This lets you do things like match a movie title without having to define every movie ever made.
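To make this concrete, here’s a rough sketch of what an interaction model for this imaginary coffee shop could look like in the skill builder’s JSON editor. (The MENU_ITEMS type name is my own invention for this example, and the exact JSON shape depends on which version of the Skills Kit console you’re using.)

```json
{
  "interactionModel": {
    "languageModel": {
      "invocationName": "coffee shop",
      "intents": [
        { "name": "Menu", "samples": ["what's on your menu", "what do you sell"] },
        { "name": "Hours", "samples": ["what are your hours"] },
        {
          "name": "OrderItem",
          "slots": [{ "name": "item", "type": "MENU_ITEMS" }],
          "samples": ["i'd like to order {item}", "give me {item}"]
        }
      ],
      "types": [
        {
          "name": "MENU_ITEMS",
          "values": [
            { "name": { "value": "coffee" } },
            { "name": { "value": "tea" } },
            { "name": { "value": "beer" } }
          ]
        }
      ]
    }
  }
}
```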

That’s a pretty rough and quick introduction, but here’s the important bit. After you do this work to define the nature of your interaction with the skill, Alexa can then break it down into simple imperative commands that can be sent to your code. You don’t have to worry about parsing English. Nope, your code will get a nice JSON packet that says an order was placed and the item was beer.
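As a sketch, here’s roughly what the relevant part of that JSON packet looks like for an order of beer, and how little work it takes to read it. (The object below is trimmed for illustration; a real request also carries version, session, and context data.)

```javascript
// A trimmed example of the request body Alexa sends for an order.
// A real request also includes version, session, and context fields.
const body = {
  request: {
    type: 'IntentRequest',
    intent: {
      name: 'OrderItem',
      slots: {
        item: { name: 'item', value: 'beer' }
      }
    }
  }
};

// No English parsing needed -- just read the properties.
const intentName = body.request.intent.name;
const orderedItem = body.request.intent.slots.item.value;

console.log(intentName);   // OrderItem
console.log(orderedItem);  // beer
```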

Also, and this is a bit of an aside, you can develop Alexa skills just for your own use! That means you can build all kinds of fun and wacky stuff for your family. Amazon recently built upon this idea with the release of Blueprints, which lets anyone build simple skills for their own devices right from the web. It’s a fabulous idea and I’m really happy to see Amazon doing this.

Working with Alexa and Webtask

Ok, so now let’s talk about some specifics for working with Alexa and Webtask, starting with the interaction at the service level. Alexa will send information about a request via a JSON object passed in the body of the request. This can be read from the context.body object. Here’s a complete, if somewhat simple, skill.


module.exports = function(context, cb) {

	let intent = 'menu';
	
	if(context.body.request.intent) intent = context.body.request.intent.name;
	
	let response = {
		"version":"1.0",
		"response":{
			"shouldEndSession":true,
			"outputSpeech": {
				"type":"PlainText"
			}
		}
	};

	if(intent === "menu") {
		response.response.outputSpeech.text = "We sell tea, espresso, and beer.";
	} else if(intent === "hours") {
		response.response.outputSpeech.text = "We are open 4 AM to 9 PM.";
	} else if(intent === "order") {
		//what did they order?
		let order = context.body.request.intent.slots.item.value;
		response.response.outputSpeech.text = `I'll get that ${order} to you right away!`;
	}

	cb(null, response);
}

Note that I default my intent value. This is because Alexa skills support an “open” request where no specific intent is requested. Alexa sends information in a JSON packet and you return JSON as well. I’ve got a JSON object with my defaults and the only thing I change is the actual text. (As an aside, you can return more than just simple text. You can return graphical information that is displayed on video-enabled Alexa devices and in the Alexa apps. You can also return MP3 audio.) You can also see an example of a dynamic slot in the last branch. I grab the exact item ordered and use it in the response.
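For instance, a response that adds a simple card for screen-equipped devices alongside the spoken text could look like this sketch (the card object follows the standard Alexa response format):

```json
{
  "version": "1.0",
  "response": {
    "shouldEndSession": true,
    "outputSpeech": {
      "type": "PlainText",
      "text": "We sell tea, espresso, and beer."
    },
    "card": {
      "type": "Simple",
      "title": "Our Menu",
      "content": "Tea, espresso, and beer."
    }
  }
}
```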

In a nutshell though, code-wise there isn’t anything really special you have to do for Webtask, and that’s great. I do recommend testing with the Logs panel open as it’s helpful to see the incoming requests.

There is one thing you’ll need to do on the Alexa side in order to properly talk to the Webtask endpoint. Under “Endpoint”, when you paste in the URL of your webtask, be sure to select: “My development endpoint is a sub-domain of a domain that has a wildcard certificate from a certificate authority.” If you do not select this, then Alexa will not be able to properly route requests to your code.

Publishing Your Skill

Everything described previously will work for testing your skill, or creating a skill just for your home devices. But what about going live? First, be sure to read Amazon’s docs on the process. It’s a bit intense. I’ve gone through it a few times now and I can say that their testers will find every single possible mistake you’ve made. The good news though is that their rejection letters are incredibly well done. They do a great job explaining what you did wrong and how to reproduce the issue.

However - part of the certification process requires you to add a level of security to your code. This security basically ensures that you are only responding to properly authenticated Alexa requests. It’s a rather exhaustive set of checks (and please, read the docs if you don’t believe me), but luckily there is a nice npm package that wraps this all up for us - alexa-verifier.

The verifier package makes it near trivial to wrap your code with the proper checks, but wouldn’t it be nice to separate your webtask from the verification process? This is where middleware comes in. Like middleware in Express apps, a middleware for your task will run before it and either pass the request on to your code or throw an error.

Let’s look at how such a middleware could work. Here’s one example. (I want to give huge thanks to my coworker, Geoff Goodman, for his help with this.)

const alexaVerifier = require('alexa-verifier');
const rawBody = require('raw-body');

module.exports = function() {

    return function alexaVerifierMiddleware(req, res, next) {

      let signaturechainurl = req.headers.signaturecertchainurl;
      let signature = req.headers.signature;

      return rawBody(req, {encoding: 'utf-8'}, (err, body) => {
        // pass along any error from reading the raw body
        if(err) return next(err);
        alexaVerifier(signaturechainurl, signature, body, function(err) {
          if(err) return next(err);
          // put the parsed body back on the context so the webtask can use it
          req.webtaskContext.body = JSON.parse(body);
          return next();
        });
      });

    };
};

So what’s going on here? Middleware for Webtask and Extend must return a function that implements the logic of the middleware itself. So in English, I’ve got code that returns code. So far so good. In order to use alexa-verifier, you need to grab two values from the headers, which is easy enough, but then you need the raw body of the request. Since we aren’t in an Express app but just a plain Node function, I used the npm package raw-body to process it.

Once I have the body and the various headers, I simply let alexa-verifier handle it. If there is an error, I return it. Otherwise I carry on by returning next(), however, make note of this line:

req.webtaskContext.body = JSON.parse(body);

This line (and again, thank you to Geoff!) handles keeping the original body value in the request so that my webtask can access it normally later on. Without it, it would be lost.

Actually using this middleware is rather simple as I’ve published it to NPM: https://www.npmjs.com/package/alexa-verifier-webtask. If you are using the Canary editor for Webtask, go to the settings for your task, select Middleware, then NPM modules, and enter alexa-verifier-webtask. If you are using the CLI, simply provide the --middleware argument:

wt create myCoolAlexaSkill --middleware=alexa-verifier-webtask

You’ll then want to test your skill again to ensure everything is still working. You should not have to change anything in your webtask code, and it should then be ready for Alexa certification.