Previously in this series, we set up a legacy service to output data that we can feed into an AWS Lambda function in order to create a report of test deployments. Today, let’s see how it all fits together, and how to create an Alexa skill to consume our reports.
Creating an Alexa skill
First, you’ll need an Amazon account and an Amazon developer account. If you want to test your skill on a live Echo (without publishing it), make sure you use the same Amazon account that an Echo device is registered to.
Log in to the Amazon Developer Portal, and under the Alexa tab, click ‘Get started’ on the Alexa Skills Kit entry. On the next page, you’ll want to create a new skill and enter some basic information about your application.
You might note that there’s an awful lot going on here: Interaction Model, Configuration, and so forth. For now, let’s gloss over most of these details, select ‘Custom Interaction Model’, and enter a Skill name and an Invocation name. The latter is how users will address your skill; in this case, someone would say “Alexa, ask Reportamatic…” and continue with their interaction from there. Let’s figure out what that interaction looks like before we go any further.
Bootstrapping an Alexa skill
Technically, the only thing you need to do is create an application that accepts requests from the Alexa service and responds appropriately, which leaves quite a bit of room for individual implementations in whatever language you prefer. If you’re running on Lambda, you have several options: C#, Java 8, node.js 4.3, or Python 2.7. To speed up development of basic skills, there are several frameworks you can use, including the alexa-app and alexa-app-server projects.
I don’t mind node, so let’s go ahead and use that. Full coverage of both packages is a little outside the scope of this post, but it’s not much harder than running npm install alexa-app-server --save and creating new skills in your server’s app path. Again, see the full documentation on GitHub for more details. The framework lets us quickly build intents and interaction models through extra parameters passed into the app.intent function. First things first, let’s create the application.
Our imports are fairly straightforward: the alexa-app framework and the AWS SDK for node.js. Setting module.change_code = 1 enables hot-reload of our module when it runs in the alexa-app-server. Finally, we create an application and assign a handler for the Launch request. This is essentially the default command passed to an Alexa skill, and is triggered when a user invokes the skill without asking for anything specific. res.say sends a block of text back out to the Alexa service that will be translated into speech and output from the user’s Echo.
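What follows is a minimal sketch of that bootstrap, not the original listing. The skill name 'reportamatic' and the welcome text are assumptions, and the framework wiring is shown in comments so that the handler itself runs without alexa-app installed.

```javascript
// Framework wiring, roughly as it might appear at the top of the skill's
// index.js (commented out so this sketch runs with no packages installed):
//
//   var alexa = require('alexa-app');
//   var AWS = require('aws-sdk');
//   module.change_code = 1;                   // hot-reload under alexa-app-server
//   var app = new alexa.app('reportamatic');  // skill name is an assumption
//   app.launch(onLaunch);

// Handler for the Launch request. alexa-app calls it with request and
// response wrappers; res.say() queues text to be spoken back to the user.
function onLaunch(req, res) {
  res.say('Welcome to Reportamatic. Ask me for a report on an environment.');
}
```

Keeping the handler as a named function, rather than an inline callback, also makes it easy to exercise with a fake response object before deploying anything.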
Now, behind the scenes, this is all just a bunch of requests coming and going. For instance, here’s the JSON for a LaunchRequest –
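As an illustration of the shape of that envelope, here is a sketch written as a JavaScript object and then serialized. The field names follow the Alexa custom skill request format; every identifier and the timestamp are placeholders rather than values from a real session.

```javascript
// Shape of a LaunchRequest envelope. All IDs and the timestamp are placeholders.
const launchRequest = {
  version: '1.0',
  session: {
    new: true,                                  // first request of this session
    sessionId: 'amzn1.echo-api.session.PLACEHOLDER',
    application: { applicationId: 'amzn1.ask.skill.PLACEHOLDER' },
    attributes: {},                             // per-session state lives here
    user: { userId: 'amzn1.ask.account.PLACEHOLDER' }
  },
  request: {
    type: 'LaunchRequest',                      // no intent: the skill was simply opened
    requestId: 'amzn1.echo-api.request.PLACEHOLDER',
    timestamp: '2016-10-27T18:21:44Z',
    locale: 'en-US'
  }
};

// Print it as the JSON document the Lambda would receive.
console.log(JSON.stringify(launchRequest, null, 2));
```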
This is the basic format for requests from the Alexa service into your Lambda. Sessions are important if you’re dealing with conversations or multi-stage interactions, as you’ll need to read and write information from and to them to persist data between steps. The request object itself is where you’ll find information such as intents, mapped utterances, and so forth. For comparison, here’s the request object for a specific intent.
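A sketch of such a request object follows. The slot name ENVIRONMENT and the value 'staging' are assumptions made for illustration, as is the small slotValue helper; only the overall layout comes from the Alexa request format.

```javascript
// The "request" portion of an IntentRequest. IDs and timestamp are placeholders;
// the slot name ENVIRONMENT and its value are hypothetical.
const intentRequest = {
  type: 'IntentRequest',
  requestId: 'amzn1.echo-api.request.PLACEHOLDER',
  timestamp: '2016-10-27T18:22:10Z',
  locale: 'en-US',
  intent: {
    name: 'SpecificReportIntent',
    slots: {
      ENVIRONMENT: { name: 'ENVIRONMENT', value: 'staging' }
    }
  }
};

// Hypothetical helper: read a slot's value, or undefined if the slot is absent.
function slotValue(request, slotName) {
  const slots = (request.intent && request.intent.slots) || {};
  return slots[slotName] ? slots[slotName].value : undefined;
}

console.log(slotValue(intentRequest, 'ENVIRONMENT'));
```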
Thankfully, we have a convenient way to deal with these requests in our framework – app.intent.
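Here is a rough sketch of what such an intent handler could look like. The report data, its field names (name, status), and the slot name ENVIRONMENT are all assumptions; the real skill would load the JSON produced in part one. The registration call is left in a comment so the sketch runs without the framework installed.

```javascript
// Hypothetical report data standing in for the JSON from part one.
const reports = [
  { name: 'staging', status: 'passed' },
  { name: 'integration', status: 'failed' }
];

// Case-insensitive lookup by name; returns undefined when nothing matches.
function findReport(allReports, name) {
  const wanted = String(name || '').toLowerCase();
  return allReports.find(function (report) {
    return report.name.toLowerCase() === wanted;
  });
}

// With alexa-app, this would be registered along the lines of:
//   app.intent('SpecificReportIntent',
//     { slots: { ENVIRONMENT: 'EnvironmentName' } },
//     specificReport);
function specificReport(req, res) {
  const report = findReport(reports, req.slot('ENVIRONMENT'));
  if (report) {
    res.say('The ' + report.name + ' deployment ' + report.status + '.');
  } else {
    res.say('Sorry, I could not find a report for that environment.');
  }
}
```

Separating the lookup from the handler keeps the speech logic thin and lets the matching rules change without touching anything Alexa-specific.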
Ultimately, we’re simply taking an array of JSON objects like the one we defined all the way back in part one and searching for a name match. How does Alexa know what intent to call, though? That’s where the intent schema and sample utterances come in.
Another convenience of our library is that it can work in conjunction with the alexa-app-server to automatically generate an intent schema. Intent schemas are essentially mappings that let the Alexa service know what request to send to your application in response to your voice. Here’s the schema for our SpecificReportIntent.
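A sketch of what that schema might contain, built as a JavaScript object and printed as JSON ready to paste into the portal. The slot name ENVIRONMENT is an assumption; the intent name and the EnvironmentName type come from the surrounding text.

```javascript
// Intent schema with one intent and one slot. The slot name is hypothetical.
const intentSchema = {
  intents: [
    {
      intent: 'SpecificReportIntent',
      slots: [
        { name: 'ENVIRONMENT', type: 'EnvironmentName' }
      ]
    }
  ]
};

// The Developer Portal expects this as JSON text.
console.log(JSON.stringify(intentSchema, null, 2));
```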
Pretty simple, yeah? What’s that EnvironmentName type, though? Alexa allows us to define a Custom Slot Type, a list of words it should try to match against. This improves voice recognition greatly, as the recognizer can map what it hears onto a known set of values. We set up the Intent Schema, Custom Slot Types, and Sample Utterances back in the Amazon Developer Portal.
Take note! Your schema and custom type may be small, but your sample utterances will probably not be! Your utterances need to capture all of the ways a user might interact with your skill. One of the topics we haven’t touched on at all is developing a quality Voice UI (VUI), and if you’re planning on doing Alexa skills ‘for real’ then you should certainly invest quite a bit of time in designing the VUI. Utterances aren’t terribly discoverable, after all, and people from different cultural or educational backgrounds may say the same thing in subtly different ways.
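To give a feel for how quickly the list grows, here is a small sketch that expands a few phrase fragments into sample utterance lines, each starting with the intent name followed by the spoken phrase. The expandUtterances helper, the fragments, and the ENVIRONMENT slot name are all made up for illustration; they are not the utterances this skill actually shipped with.

```javascript
// Combine every opener with every request phrase into lines of the form
// "<IntentName> <spoken phrase>", collapsing any doubled spaces.
function expandUtterances(intentName, openers, requests) {
  const lines = [];
  openers.forEach(function (opener) {
    requests.forEach(function (request) {
      const line = intentName + ' ' + opener + ' ' + request;
      lines.push(line.replace(/\s+/g, ' ').trim());
    });
  });
  return lines;
}

// Three openers times two phrasings already gives six utterances.
const sampleUtterances = expandUtterances(
  'SpecificReportIntent',
  ['', 'for', 'give me'],
  ['the {ENVIRONMENT} report', 'the report for {ENVIRONMENT}']
);

console.log(sampleUtterances.join('\n'));
```

Every additional way of phrasing a request multiplies the total, which is why the utterance list tends to dwarf the schema.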
Let’s finish our skill up with a final intent, one where we can get all of the available reports.
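One possible shape for that intent is sketched below. The intent name AllReportsIntent, the report fields, and the choice of a half-second SSML break are assumptions for illustration, and the registration call is again left in a comment so the sketch runs on its own.

```javascript
// Hypothetical report data standing in for the JSON from part one.
const reports = [
  { name: 'staging', status: 'passed' },
  { name: 'integration', status: 'failed' }
];

// Build one sentence per report, each followed by an SSML pause so the
// list does not run together when spoken.
function describeReports(allReports) {
  if (allReports.length === 0) {
    return 'There are no reports available.';
  }
  return allReports.map(function (report) {
    return 'The ' + report.name + ' deployment ' + report.status +
      '. <break time="500ms"/>';
  }).join(' ');
}

// With alexa-app, this would be registered along the lines of:
//   app.intent('AllReportsIntent', {}, listReports);
function listReports(req, res) {
  res.say(describeReports(reports));
}
```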
An important thing to point out: the string passed to res.say ends with an SSML tag. Since the text that’s sent back is interpreted as SSML, you’re able to add pauses or other instructions for how it should be spoken.
At the end of our declarations, we need to export our application via module.exports = app; and then we’re done with node for the time being. To deploy your skill to Lambda, simply make a zip file of its package.json, node_modules, and all .js files in the folder, and upload it as a new Lambda function. This requires an AWS account, which again, is slightly outside the scope of this post. I will note that when you make the Lambda function, you’ll need to create an IAM role to execute the function under. Please see the AWS documentation for more information on how to configure this role.
Back in the Amazon Developer Portal, there’s one last thing to do. Get the ARN of your Lambda function (upper-right corner of the Lambda page) and copy it. In the Developer Portal, under the ‘Configuration’ option, you’ll see a space to enter it.
With that, you’re pretty much done! You should be able to go into the test tab, send a sample request, and see the appropriate response. You should also be able to query an Echo device on your developer account with one of your intents and have it respond to you.
This is, of course, a pretty simple example: we didn’t implement much sorting, filtering, or other conversational options on our data. Once I have time, I plan to add more information to the data from our internal systems, so that users can get more details (such as which tests passed or failed) and have conversations with the skill (rather than having it simply read out a list of items). However, I hope that you’ll take the ideas and samples in this series and use them to build something amazing for your team!