Backyard Birder

Inspiration

I live in an area that is visited by a wide variety of birds (and squirrels). I thought it would be interesting to explore whether Amazon's DeepLens could identify bird species by both visual and song characteristics, as well as keep track of how many squirrels disrupt the feeders each day.

I had a difficult time finding good seed image libraries for bird species and wanted to get my hands dirty with this new hardware. Since getting a sample project running on DeepLens is easy, I thought I could use one to gather a coarse collection of bird images that I could further develop into my species identification model.

What the application does and the goals of the project

The AWS DeepLens hardware runs image processing locally, using models trained with a variety of machine learning methodologies. The graphics processing hardware allows low-latency video stream processing at the edge (i.e., disconnected from the cloud).

My application uses an included sample model trained with MXNet (deeplens-object-detection) to identify the visitors to the bird feeders in my yard. The sample model identifies birds with reasonable accuracy. It has no squirrel class, but squirrels tend to score fairly high against the "cat" and "dog" classes (this is a hackathon, isn't it?). When a bird or a squirrel is recognized with a high probability score, a message is published to a message topic (MQTT), where a listener process tallies the counts and stores them in a database, so that a daily squirrel vs. bird scorecard can be accessed via the web.
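The detection-to-message step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual Lambda code: the 0.6 threshold, the payload fields, and the function names are all assumptions, and the label strings assume the model reports class names such as "bird", "cat", and "dog".

```python
import json
import time

# Assumed mapping from model labels to the visitors we care about.
# The sample model has no squirrel class, so "cat" and "dog" hits
# are treated as squirrels, as described above.
VISITOR_BY_LABEL = {"bird": "bird", "cat": "squirrel", "dog": "squirrel"}

# Hypothetical cut-off for a "high" identification score.
SCORE_THRESHOLD = 0.6


def classify_visitors(detections, threshold=SCORE_THRESHOLD):
    """Return the visitor type for each detection that clears the threshold.

    `detections` is a list of (label, probability) pairs.
    """
    visitors = []
    for label, prob in detections:
        visitor = VISITOR_BY_LABEL.get(label)
        if visitor is not None and prob >= threshold:
            visitors.append(visitor)
    return visitors


def build_message(visitor, timestamp=None):
    """Build the JSON payload that would be published to the MQTT topic."""
    if timestamp is None:
        timestamp = int(time.time())
    return json.dumps({"visitor": visitor, "timestamp": timestamp})


if __name__ == "__main__":
    frame_detections = [("bird", 0.91), ("cat", 0.72), ("person", 0.95), ("dog", 0.30)]
    for visitor in classify_visitors(frame_detections):
        print(build_message(visitor))
```

On the device, each payload would be published to the project's MQTT topic rather than printed. Keeping the label mapping in one dictionary makes it easy to swap in a real squirrel class once a more specific model exists.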

This trigger could be extended to take some physical action to deter the squirrel from the feeder; however, I am less concerned with that and more interested in collecting positive bird species samples for a more specific bird model.

Initially I am using a pre-trained model, but this is the first step in a broader project that is a "train the trainer" application: using existing coarse models as the source for generating new seed images to train a new model with AWS SageMaker, in this case to specifically identify bird species.

Inside Elastic Beanstalk, a Node.js web server hosts a page showing the day's tally of bird and squirrel sightings, which is stored in DynamoDB.
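As a rough sketch of the tallying behind that scorecard, the listener's counting logic might look like the following. It assumes each message is a JSON object with hypothetical `visitor` and `timestamp` (Unix seconds) fields, and it keeps the counts in an in-memory `Counter` as a stand-in for the DynamoDB table, whose layout is not described here.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def tally_messages(messages):
    """Count sightings per (UTC date, visitor) from JSON message strings."""
    counts = Counter()
    for raw in messages:
        msg = json.loads(raw)
        day = datetime.fromtimestamp(msg["timestamp"], tz=timezone.utc).strftime("%Y-%m-%d")
        counts[(day, msg["visitor"])] += 1
    return counts


def scorecard(counts, day):
    """Return the bird and squirrel totals for one day."""
    return {
        "date": day,
        "birds": counts[(day, "bird")],
        "squirrels": counts[(day, "squirrel")],
    }


if __name__ == "__main__":
    sample = [
        json.dumps({"visitor": "bird", "timestamp": 0}),
        json.dumps({"visitor": "squirrel", "timestamp": 20}),
    ]
    print(scorecard(tally_messages(sample), "1970-01-01"))
```

Grouping by UTC date is a simplification; a real scorecard would probably use the feeder's local time zone to decide where one day ends.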

DeepLens Introduction

Chances are this is your first look at AWS DeepLens, so I wanted to give a quick intro to the hardware and the AWS components used in this project.

Amazon DeepLens - What Is It?

According to Amazon, DeepLens is "the world’s first deep learning enabled video camera for developers."
From Amazon's AWS DeepLens Portal:

AWS DeepLens is a wireless video camera and API that you can use to learn how to use the latest Artificial Intelligence (AI) tools and technology and develop your own computer vision applications. Use AWS DeepLens to get hands-on experience using a physical camera that runs real-time computer vision models, examples, and tutorials. Get started with AWS DeepLens by using any of the pretrained models that come with your device. As you become proficient, you can develop, train, and deploy your own models.

AWS DeepLens Hardware and Framework

The AWS DeepLens camera, or device, uses deep convolutional neural networks (CNNs) to analyze visual imagery. You can use the device as a development environment to build computer vision applications. The device includes the following:

A 4 megapixel camera with MJPEG

8 GB of on-board memory

16 GB storage capacity

A 32 GB SD card

WiFi support for both 2.4 GHz and 5 GHz standard dual-band networking

A micro HDMI display port

Audio out and USB ports

AWS DeepLens is powered by an Intel® Atom processor, which is capable of 100 GFLOPS (100 billion floating-point operations per second). This gives you all of the compute power that you need to perform inference on your device. The micro HDMI display port, audio out, and USB ports allow you to attach peripherals so you can get creative with your computer vision applications.

AWS DeepLens works with the following AWS services:

Amazon SageMaker, for model training and validation

AWS Lambda, for running inference against CNN models

AWS Greengrass, for deploying updates and functions to your device

AWS DeepLens is ready to use right out of the box. After you register AWS DeepLens, deploy a sample project and begin using it to develop your own computer vision applications.

How It Works:

The following diagram illustrates how AWS DeepLens works.

When turned on, the AWS DeepLens captures a video stream.

AWS DeepLens produces two output streams:

Device stream – the video stream is passed through with no processing.

Project stream – the results of the model's processing video frames.

The Inference Lambda function receives unprocessed video frames.

The Inference Lambda function passes the unprocessed frames to the project's deep learning model where they are processed.

The Inference Lambda function receives the processed frames back from the model and then passes the processed frames on in the project stream.
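The flow above can be sketched as a simple loop. This is only an illustration of the order of operations, not the sample project's code: on the device these steps are handled by the DeepLens runtime libraries, while here all four callables are hypothetical stand-ins passed in as parameters.

```python
def run_inference_loop(get_frame, infer, annotate, publish, max_frames):
    """Process up to `max_frames` frames through the steps described above.

    All four callables are hypothetical stand-ins:
      get_frame() -> frame or None   (unprocessed frame from the camera)
      infer(frame) -> detections     (the project's deep learning model)
      annotate(frame, detections) -> processed frame
      publish(processed_frame)       (writes to the project stream)
    Returns the number of frames that were actually processed.
    """
    processed = 0
    for _ in range(max_frames):
        frame = get_frame()
        if frame is None:
            # No frame was available on this pass; skip it.
            continue
        detections = infer(frame)
        publish(annotate(frame, detections))
        processed += 1
    return processed


if __name__ == "__main__":
    frames = iter(["frame-1", None, "frame-2"])
    project_stream = []
    count = run_inference_loop(
        get_frame=lambda: next(frames),
        infer=lambda frame: [("bird", 0.9)],
        annotate=lambda frame, detections: (frame, detections),
        publish=project_stream.append,
        max_frames=3,
    )
    print(count, project_stream)
```

Passing the steps in as functions keeps the sketch runnable off-device; a real inference function would loop indefinitely rather than stop after a fixed number of frames.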

Hardware - Model AMDC-1

CPU: Intel Atom® Processor

MEMORY: 8GB RAM

OS: Ubuntu OS-16.04 LTS

BUILT-IN STORAGE: 16GB Memory (expandable)

GRAPHICS: Intel Gen9 Graphics Engine

POWER SUPPLY: 5V - 4A

Prerequisites and Setup

You will need the DeepLens hardware to build this project.

You will need an AWS account.

Getting Started - Device Registration

Amazon has a really good portal for the initial setup required to prepare your DeepLens and connect it to your network.

Before using AWS DeepLens, you must register your device, connect it, set it up, and verify that it's connected.
The following graphic shows where you perform each step:

If you would like to try to replicate my project, here are a few steps to follow:

Configure AWS

I am assuming you have some experience working with AWS, so I am including links to help articles for each of the tasks. All services should be configured in the same VPC to simplify security and role management.

The topic name "deeplens_xxxxxxxxxxxxxxx" is obtained from your deployed project; it appears at the bottom of the deployed project page. This creates a bit of a "chicken and egg" scenario: you set up the rule first with a dummy value in the topic field, then come back and update the rule after the project is deployed.
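If you prefer to script the rule rather than use the console, the two-pass setup could be sketched as below. This is a hedged example: it assumes the rule forwards messages to a Lambda function (the rule could equally write to a database directly), and the placeholder topic, the real topic value, and the function ARN are all hypothetical. The dictionary follows the shape of an AWS IoT topic rule payload, as accepted by boto3's `create_topic_rule` and `replace_topic_rule` in the `topicRulePayload` argument.

```python
# Dummy topic used before the project is deployed (hypothetical value).
PLACEHOLDER_TOPIC = "deeplens_placeholder"


def build_rule_payload(topic, function_arn):
    """Build an AWS IoT topic rule payload that forwards messages to Lambda.

    `function_arn` is the ARN of a hypothetical tally function.
    """
    return {
        "sql": "SELECT * FROM '{}'".format(topic),
        "description": "Forward DeepLens detections to the tally function",
        "actions": [{"lambda": {"functionArn": function_arn}}],
        "ruleDisabled": False,
    }


if __name__ == "__main__":
    # Hypothetical ARN, for illustration only.
    arn = "arn:aws:lambda:us-east-1:123456789012:function:tally"

    # First pass: create the rule with the dummy topic.
    print(build_rule_payload(PLACEHOLDER_TOPIC, arn)["sql"])

    # Second pass: after deployment, rebuild with the real topic name
    # copied from the bottom of the deployed project page.
    print(build_rule_payload("deeplens_xxxxxxxxxxxxxxx", arn)["sql"])
```

The first payload would go to the create call and the second to the replace call, so only the topic in the SQL statement changes between the two passes.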

Offline Documentation

Challenges I ran into

Currently, SageMaker only supports training new models with Apache MXNet. I have had a little experience training models using TensorFlow and was hoping to import my existing models. Still, it was a good exercise for me to look at how MXNet works and expand my scope of ML frameworks. I look forward to the addition of more ML frameworks to the DeepLens platform so I can continue trying them out to find the best options for the job.

Accomplishments that I'm proud of

Even though I was able to quickly get up and running using the sample models, I was happy I took the extra time to build out my own model using SageMaker and Jupyter notebooks. Python is not my strongest language, and working with Jupyter notebooks made it a little easier to work through the process. I recommend that after you get a couple of the sample models up and running, you take the time to train your own model, as that really extends the value of this platform.

What I learned

It was interesting to explore the Apache MXNet framework. I had previously used TensorFlow, and I like to make sure I am keeping on top of multiple options when considering technologies. Apache MXNet is a very mature project with an active user community.

I learned how this integration of customized hardware and software allows for very fast deployment of deep learning models and gives developers a quick way to begin exploring the field.

I was also very happy to find the extensive documentation put together by the AWS DeepLens team. The well-documented step-by-step guides were helpful when digging into a totally new platform.

What's next for my project

This first pass of my project is great for weeding out the birds from the squirrels, but I am interested in extending the model to do more detailed identifications. The images captured during the first phase of the project will be a good resource as I extend the training to more specifically identify bird species.

As I mentioned above, the SageMaker platform currently only supports MXNet models, and I am interested in porting over my TensorFlow-trained models. Here is a link to my git project with my training information for bird species identification using TensorFlow.

I am also very interested in seeing if I can adapt the hardware to include bird songs as part of the identification criteria.