
This post is the second in a series on getting started with the AWS DeepLens. In Part 1, we introduced a program that could detect faces and crop them by extending the boilerplate Greengrass Lambda function and pre-built model provided by AWS. That post focused on the local capabilities of the device, but the DeepLens is much more than that. At its core, DeepLens is a fully fledged IoT device, which covers just one of the 3 Pillars of IoT: devices, cloud and intelligence.

All code and templates mentioned can be found here. This can be deployed using AWS SAM, which helps reduce the complexity of creating event-based AWS Lambda functions.

Sending faces to the IoT Message Broker

AWS DeepLens Device Console Page

When registering a DeepLens device, AWS provisions everything associated with the IoT cloud pillar. If you have a look for yourself in the IoT Core AWS console page, you will see the existing IoT groups, devices, certificates, etc. This all simplifies the process of interacting with the middle-man: the MQTT topic that is displayed on the main DeepLens device console page. The DeepLens (and any other client granted authorisation) has the right to publish messages to the IoT topic within certain limits.

Previously, the AWS Lambda function responsible for detecting faces only showed them on the output streams, publishing nothing more than the detection threshold of each face to the MQTT topic. We can modify this by including the cropped face images as part of the messages that are sent to the topic.

The Greengrass function below extends the original version by publishing a message for each detected face. Each cropped face image is Base64 encoded and set in the “image_string” key of the message. IoT messages have a size limit of 128 KB, but the encoded images fall well within that limit.
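A minimal sketch of that publishing step, assuming a hypothetical topic name and the bounding box coordinates from the detection results:

```python
import base64
import json

import cv2
import greengrasssdk

# Greengrass SDK client used to publish to the IoT message broker
client = greengrasssdk.client('iot.data')

# Hypothetical topic name; use the one shown on your DeepLens console page
iot_topic = '$aws/things/deeplens_example/infer'

def publish_face(frame, xmin, ymin, xmax, ymax, probability):
    """Crop a detected face from the frame and publish it to the MQTT topic."""
    face = frame[ymin:ymax, xmin:xmax]
    # Encode the crop as JPEG, then Base64 so it travels inside a JSON payload
    _, jpeg = cv2.imencode('.jpg', face)
    image_string = base64.b64encode(jpeg.tobytes()).decode('utf-8')
    message = {
        'probability': probability,
        'image_string': image_string,
    }
    # IoT messages are limited to 128 KB; a cropped face stays well under that
    client.publish(topic=iot_topic, payload=json.dumps(message))
```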

Save faces to S3 with an IoT Rule

The third IoT pillar, intelligence, interacts with the cloud pillar, using insights to perform actions on other AWS and/or external services. Our goal is to have every detected face saved to an S3 bucket in the original JPEG format, before it was encoded to Base64. To achieve this, we need to create an IoT rule that will launch an action to do so.

IoT Rules listen for incoming MQTT messages on a topic and, when a certain condition is met, launch an action. The messages from the topic are analysed and transformed using a provided SQL statement. We want to act on all messages, passing on the data captured by the DeepLens device and also injecting a “unix_time” property. The IoT Rule Engine allows us to construct statements that do just that, calling the timestamp() function within a SQL statement to add it to the result, as seen in the statement below.
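A statement matching that description could look like this (the topic name is an assumption; substitute the one from the DeepLens console):

```sql
SELECT *, timestamp() AS unix_time FROM '$aws/things/deeplens_example/infer'
```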

The action is an AWS Lambda function (seen below) that is given an S3 bucket name and an event. At a minimum, the event must contain two properties: “image_string”, the encoded image, and “unix_time”, which is used for the name of the file. The latter is not provided when the IoT message is published to the MQTT topic; instead, it is added by the IoT rule that calls the action.
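A sketch of such a function, assuming the bucket name arrives through an environment variable (the original may wire it up differently):

```python
import base64
import os

import boto3

s3 = boto3.client('s3')

# Bucket name supplied through the function's environment (an assumption)
BUCKET_NAME = os.environ['BUCKET_NAME']

def handler(event, context):
    """Decode the Base64 face image and save it to S3 as a JPEG."""
    image = base64.b64decode(event['image_string'])
    # unix_time is injected by the IoT rule, not published by the DeepLens
    key = '{}.jpg'.format(event['unix_time'])
    s3.put_object(Bucket=BUCKET_NAME, Key=key, Body=image)
    return {'saved': key}
```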

Deploying an IoT Rule with AWS SAM

AWS SAM makes it incredibly easy to deploy an IoT Rule, as it is a supported event type for Serverless Function resources, a high-level wrapper for AWS Lambda. By providing only the DeepLens topic name as a parameter to the template below, a fully event-driven, least-privilege AWS architecture is deployed.
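A template along these lines would do it; the resource names, code path and runtime are assumptions:

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Parameters:
  DeepLensTopic:
    Type: String
    Description: MQTT topic the DeepLens publishes detected faces to

Resources:
  FacesBucket:
    Type: AWS::S3::Bucket

  SaveFaceFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/
      Handler: save_face.handler
      Runtime: python3.9
      Environment:
        Variables:
          BUCKET_NAME: !Ref FacesBucket
      Policies:
        # Least privilege: write access to the faces bucket only
        - S3WritePolicy:
            BucketName: !Ref FacesBucket
      Events:
        FaceDetected:
          Type: IoTRule
          Properties:
            Sql: !Sub "SELECT *, timestamp() AS unix_time FROM '${DeepLensTopic}'"
```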

TechConnect recently acquired two AWS DeepLens devices to play around with. Announced at re:Invent 2017, the AWS DeepLens is a small Intel Atom-powered, deep-learning-focused device with an embedded high-definition video camera. The DeepLens runs AWS Greengrass, allowing quick compute for local events without having to send large amounts of data to the cloud for processing. This can substantially help businesses reduce costs, sensitive information transfer, and response latency for local events.

Zero to Hero (pick a sample project)

Creating a face detection AWS DeepLens project

What I think makes the DeepLens special is how easy it is to get started using computer vision models to process visual surroundings on the device itself. TechConnect has strong capabilities in Machine Learning, but I myself haven’t had much of a chance to play around with Deep Learning frameworks like MXNet or TensorFlow. Thankfully, AWS provides a collection of pre-trained models and projects to help anyone get started. If you are already savvy with those frameworks, you can train and use your own models too.

Face Detection Lambda Function

AWS DeepLens Face Detection Lambda Function

An AWS DeepLens project consists of a trained model and an AWS Lambda function (written in Python) at its core. These are deployed to the device to run via AWS Greengrass, where the AWS Lambda function continually processes each frame coming in from the video feed using the awscam module.

The function can access the model, which is downloaded to the device as an accessible artifact at a set path. This location and others (for example, “/tmp”) have permissions granted to the function through the AWS Greengrass group associated with the DeepLens project. I chose the face detection sample project, which detects faces in a video frame captured from the DeepLens camera and draws a rectangle around them.
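In outline, the inference loop of such a function looks roughly like the following; the awscam calls mirror the sample projects, while the model file name, input size and confidence threshold are assumptions:

```python
import awscam
import cv2

# The deployed model artifact's exact path and file name are assumptions
MODEL_PATH = '/opt/awscam/artifacts/mxnet_deploy_ssd_FP16_FUSED.xml'
INPUT_SIZE = 300  # SSD input resolution (an assumption)

# Load the optimised model onto the device's GPU
model = awscam.Model(MODEL_PATH, {'GPU': 1})

while True:
    # Grab the latest frame from the camera feed
    ret, frame = awscam.getLastFrame()
    if not ret:
        continue
    # The model expects a fixed input size, so resize before inference
    resized = cv2.resize(frame, (INPUT_SIZE, INPUT_SIZE))
    results = model.parseResult('ssd', model.doInference(resized))['ssd']
    # Scale detection boxes back up to the original frame size
    xscale = float(frame.shape[1]) / INPUT_SIZE
    yscale = float(frame.shape[0]) / INPUT_SIZE
    for face in results:
        if face['prob'] > 0.5:  # confidence threshold (an assumption)
            xmin, ymin = int(face['xmin'] * xscale), int(face['ymin'] * yscale)
            xmax, ymax = int(face['xmax'] * xscale), int(face['ymax'] * yscale)
            # Draw a rectangle around the detected face
            cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), (255, 165, 20), 4)
```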

Extending the original functionality: AWS DeepLens Zoom Enhance!

AWS DeepLens Face Detection Enhance

I decided to have a bit of fun and extend the original application functionality by cropping and enhancing a detected face. The DeepLens project video output is set to 480p, but the camera frames from the device are of a much higher resolution than this! So, reusing the code from the original sample that drew a rectangle around each detected face, I was able to capture a face and display it on the big screen. The only difficult part was centring the captured face and adding padding, bringing back bad memories of how hard centring an image in CSS used to be!
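A sketch of that crop-and-centre step with OpenCV, assuming the bounding box comes from the detection results and a padding ratio picked arbitrarily:

```python
import cv2

def crop_and_centre(frame, xmin, ymin, xmax, ymax, pad_ratio=0.2):
    """Crop a detected face and pad it so it sits centred in the output.

    pad_ratio is an assumption; it controls how much context is kept
    around the face.
    """
    height, width = frame.shape[:2]
    # Expand the box by a fraction of its size, clamped to the frame edges
    pad_x = int((xmax - xmin) * pad_ratio)
    pad_y = int((ymax - ymin) * pad_ratio)
    x0, y0 = max(xmin - pad_x, 0), max(ymin - pad_y, 0)
    x1, y1 = min(xmax + pad_x, width), min(ymax + pad_y, height)
    face = frame[y0:y1, x0:x1]
    # Pad the shorter dimension with black borders so the face sits centred
    h, w = face.shape[:2]
    if h > w:
        border = (h - w) // 2
        face = cv2.copyMakeBorder(face, 0, 0, border, h - w - border,
                                  cv2.BORDER_CONSTANT, value=(0, 0, 0))
    elif w > h:
        border = (w - h) // 2
        face = cv2.copyMakeBorder(face, border, w - h - border, 0, 0,
                                  cv2.BORDER_CONSTANT, value=(0, 0, 0))
    return face
```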