Often in the evening, I come across a random television commercial that has nothing to do with my interests and completely misses the mark. Several weeks back, I began to wonder: what if advertisers could dynamically adjust content based on how I was feeling? What if advertisements could be targeted based on your mood?

In 2017, Adobe hosted a panel of experts on the “Future of Advertising” during the Advertising Week conference in New York City. The major area of focus for the panelists was the future impact of targeted advertisements and personalization. Kelly Andresen, Senior Vice President at USA Today, said, “Agencies, brands and publishers alike, will all have teams of psychologists on staff to really take on understanding of what is that human connection, what is the emotion.” The panel agreed that by 2022, media firms will need to rely on artificial intelligence, specifically facial recognition, to provide sentiment analysis that delivers advertisements that resonate with their target audiences.

“Sentiment analysis aims to determine the attitude of a speaker, writer, or other subject with respect to some topic or the overall contextual polarity or emotional reaction to a document, interaction, or event.” —Wikipedia

Through extensive research, Netflix has found that its users spend an average of 1.8 seconds considering a title when deciding what to watch. This year, the company is on track to spend upwards of $8B on content. If users are truly judging a book by its cover, Netflix needs a creative solution to ensure its content is properly showcased. Today, it uses sentiment analysis via computer vision to determine which image from a given title is most likely to resonate with a viewer browsing the Netflix user interface. This spring, I gave a talk about sentiment analysis and how AWS can help customers address this requirement.

In this blog, I am going to share with you some ways to work with an off-the-shelf camera to capture sentiment in near real time.

Solution Components

Amazon Rekognition

Amazon Rekognition is a fully managed AWS service, developed by our computer vision scientists, that allows you to extract contextual metadata from a video or image. It is extremely easy to use and provides scene detection, facial recognition, facial analysis, person tracking, unsafe content detection, and much more. We will be using Rekognition to provide sentiment analysis for our content.

Camera Capture

Amazon Rekognition requires an image or video asset before it can extract metadata. The camera you use for this job is ultimately up to you, but I was looking for one with a simple setup and a fully documented REST API, like AWS DeepLens. For this post I am using the Amcrest ProHD 1080P (~$70 USD on Amazon.com) because of its easy setup (not covered in this post) and its fully documented REST API, which supports capturing content along with other interesting features such as motion detection, audio output, and the ability to record video to a file share. [Editor’s note: Links are provided for convenience and should not be construed as an endorsement of Amcrest.]

Code

To simplify interacting with both the Amcrest camera and Rekognition, I wrote custom Python classes that reduce working with the camera to fewer than 10 lines of code.
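A minimal sketch of what such a camera wrapper might look like, using only the standard library. The snapshot endpoint path (`/cgi-bin/snapshot.cgi`) and the use of HTTP digest auth are assumptions about the Amcrest API; check them against your camera's documentation before relying on them.

```python
import urllib.request

class AmcrestCamera:
    """Tiny wrapper around an IP camera's HTTP snapshot endpoint.

    The endpoint path and digest auth are assumptions; adjust to match
    your camera's documented REST API.
    """

    def __init__(self, host, username, password):
        self.base = f"http://{host}/cgi-bin"
        # Amcrest cameras typically protect the API with HTTP digest auth.
        mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
        mgr.add_password(None, self.base, username, password)
        self.opener = urllib.request.build_opener(
            urllib.request.HTTPDigestAuthHandler(mgr)
        )

    def snapshot_url(self):
        return f"{self.base}/snapshot.cgi"

    def snapshot(self, timeout=10):
        """Fetch a single JPEG frame from the camera."""
        with self.opener.open(self.snapshot_url(), timeout=timeout) as resp:
            return resp.read()
```

With a wrapper like this, capturing a frame and handing it to Rekognition really does come down to a few lines: construct the camera, call `snapshot()`, and pass the bytes along.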

The Rekognition confidence score is a percentage (0–100) that indicates how certain the service is about a given result. The Rekognition API also allows you to pass an additional parameter that sets the confidence threshold to a value you are comfortable with.
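For face emotion results, you can also apply a threshold of your choosing on the client side. The helper below is a sketch of that idea; the `sample` dictionary mimics the shape of a single `FaceDetails` entry from Rekognition's `DetectFaces` response and is illustrative, not real API output.

```python
def emotions_above(face_detail, min_confidence=50.0):
    """Return (emotion, confidence) pairs at or above min_confidence."""
    return [
        (e["Type"], e["Confidence"])
        for e in face_detail.get("Emotions", [])
        if e["Confidence"] >= min_confidence
    ]

# Illustrative stand-in for one face's emotion scores.
sample = {"Emotions": [
    {"Type": "HAPPY", "Confidence": 92.4},
    {"Type": "SURPRISED", "Confidence": 31.7},
    {"Type": "CALM", "Confidence": 4.2},
]}

print(emotions_above(sample, min_confidence=50.0))  # [('HAPPY', 92.4)]
```

Raising `min_confidence` trades recall for precision: you report fewer emotions, but with more certainty behind each one.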

How It Works

The Amcrest camera I am working with is not connected to the Internet; it is effectively an internal IoT device. There are several ways to solve this connectivity problem. I chose to use Amazon Simple Queue Service (SQS) and a local Docker instance on my laptop, which sits on the same network as the camera. I also explored using a Raspberry Pi Zero W board as my local compute platform, but once the small Python application that polls SQS was written and placed inside the Docker container, it no longer mattered where it ran.

I’m using an AWS IoT 1-Click Enterprise Button to kick off my workflow. Every time the button is clicked, it publishes a message onto SQS via a Lambda function that I have created. From there, my small Python app running in the Docker container sees the message and issues a REST API call to the Amcrest camera. The Amcrest camera takes a picture, and I send the image off to S3 for storage. The local Docker instance calls out to Amazon Rekognition to capture my current sentiment and subsequently stores that metadata in an Amazon DynamoDB table. Finally, that emotional sentiment is sent off to our text-to-speech service, Amazon Polly, which uses a synthetic voice to tell you how you are feeling via the camera’s built-in speaker.
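The core of that flow can be sketched as a single message handler. This is a simplified illustration, not my actual application code: the camera, Rekognition client, DynamoDB table, and "speak" callback are all injected, so the logic can be exercised without AWS credentials, and the message shape (`{"action": "capture"}`) is a hypothetical convention.

```python
def process_message(body, camera, rekognition, table, speak):
    """Handle one 'capture' message from the queue, end to end.

    All collaborators are injected: in production they would be the
    camera wrapper, boto3 Rekognition client, a DynamoDB table resource,
    and a Polly-backed playback function.
    """
    if body.get("action") != "capture":
        return None
    image_bytes = camera.snapshot()  # REST call to the local camera
    faces = rekognition.detect_faces(
        Image={"Bytes": image_bytes}, Attributes=["ALL"]
    )["FaceDetails"]
    if not faces:
        return None
    # Keep the highest-confidence emotion reported for the first face.
    top = max(faces[0]["Emotions"], key=lambda e: e["Confidence"])
    table.put_item(Item={"emotion": top["Type"],
                         "confidence": str(top["Confidence"])})
    # In the real workflow this string is synthesized by Amazon Polly
    # and played through the camera's speaker.
    speak(f"You look {top['Type'].lower()}.")
    return top["Type"]
```

Because every dependency is a parameter, the same handler works whether it runs in a Docker container on a laptop or on a Raspberry Pi, which matches the observation above that the compute location stopped mattering.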

Additionally, you could certainly adjust this architecture to use an Alexa device instead of an AWS IoT button to call the same API.

What’s Next?

Looking to the future, real-time sentiment analysis for advertising and audience feedback is a growing set of use cases.

Out-of-home (OOH) advertising is a $7.7B market focused on billboards and other formats, such as the dynamic content you may see in an airport or train station. Internally, we describe this use case as smart billboards that present content to customers based on who they are and what their current emotional state is. Additionally, by capturing sentiment, a media company could use this information to insert an advertisement that will resonate with the viewer based on how they are feeling.

One could imagine leveraging this technology to provide feedback to a public speaker while they are delivering their content. Imagine at re:Invent, for example, if the speaker could know which section of the audience is enjoying the content based on a visual cue shown on the comfort monitors, color coded to match the level of engagement (green, yellow, or red). Based on the color, the speaker could adjust their presence on stage, or perhaps even their voice inflections, to better resonate with the audience.

re:Invent Presentation with comfort monitors

Wrap Up

This how-to post can help developers get started on the mechanics of capturing images or video for sentiment analysis. As you look to put this into production, you will want to expand on the concepts here, for example by integrating with Amazon Kinesis Video Streams for more real-time analysis.

Advertising, deeper audience engagement, and meaningful customer interactions will continue to get better with artificial intelligence. Sixteen years ago, Steven Spielberg considered what advertising in the future could look like. The future is here, and it is up to you to build the next great customer engagement platform using technology like Amazon Rekognition.

Paul Roberts

Paul Roberts is a Strategic Solutions Architect for Amazon Web Services. When he is not working on serverless applications, DevOps, Open Source, or Artificial Intelligence, he is often found exploring the mountains near Lake Tahoe with his family.