Algorithmia Blog - Deploying AI at scale

February 24, 2016

Hey Zuck, We Built Your Office A.I. Solution

Like many, we were pretty inspired by Mark Zuckerberg’s 2016 personal challenge to build some artificial intelligence tools to help him at home and work. We spend a lot of time at Algorithmia helping developers add algorithmic intelligence to their apps. So, with a hat tip to Zuck, we challenged ourselves to see what kind of A.I. solution we could come up with during a recent internal hackathon.

What We Made

In less than 24 hours, we created an automated front desk A.I. that uses facial recognition to identify and greet our coworkers as they arrive at the office.

We taped an Amazon Fire tablet to the wall to act as our front desk kiosk, using its front-facing camera to shoot video. As a user walks up to the tablet, we start sampling frames from the video, which are sent to the CMU OpenFace library to check for a match.

If we’ve seen you before, we welcome you to our office by doing three things: 1) our front desk A.I. announces that you’ve arrived in our Slack channel. 2) Slackbot then sends you a summary of Git commits since the last time you checked in. 3) The office Spotify changes to your favorite song.

If you’re new here, we have you run through a training exercise where you mimic some emojis, and pick your song. The next time you arrive at the office, you’ll be in the system, and ready to go.

The icing on the cake: we built this entirely on Algorithmia, which means we didn’t have to set up or configure any servers.

Building the Facial Recognition Service

From the start, our biggest concern was that we needed a facial recognition algorithm that could build an accurate model with as few images as possible, since we didn’t want our users having to train for more than a few seconds. Using the CMU OpenFace library, we were able to accomplish this with as few as 10 images, which was perfect for handling our training and facial recognition tasks.

We had just heard about this library, and were eager to test their claim that the update improved recognition accuracy from 76.1% to 92.9% in half the execution time. Although we haven’t done any benchmarking, we were impressed by the anecdotal results, and are looking forward to making the CMU OpenFace library publicly available in the Algorithmia Marketplace as soon as possible. The speed and accuracy could be a game-changer for anybody interested in deep neural network training, and closes the gap from weeks to days for facial recognition.

We created a simple training routine where the user looks at the camera and makes a series of faces. This ensures we capture enough variety of facial expressions for the model. While the user is training, we’re sampling images from the video, labeling them, and getting them ready to process.
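As a rough illustration, the sampling-and-labeling step can be sketched like this (the function name, label, and sampling rate are our own for the example, not the app’s actual code):

```python
# Hypothetical sketch: given a stream of captured frames, keep every
# Nth one and tag it with the user's name so it can be fed to OpenFace
# as labeled training data.

def sample_training_frames(frames, user_label, every_nth=5):
    """Return (label, frame) pairs for every `every_nth` frame."""
    samples = []
    for i, frame in enumerate(frames):
        if i % every_nth == 0:
            samples.append((user_label, frame))
    return samples

# e.g. a 3-second capture at 30 fps yields 90 frames; keeping every
# 5th frame gives 18 labeled images -- comfortably above the ~10
# images OpenFace needed in our tests.
frames = [f"frame-{i}" for i in range(90)]
labeled = sample_training_frames(frames, "alice")
```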

We wrapped this entire process in a couple of algorithms running on Algorithmia, which operate like microservices. When the user first walks up, the tablet takes photos and sends them to our FaceClassify algorithm, which continually checks each image with OpenFace to see if we recognize the user. If we recognize you, we send back your UID and kick off the GetUserData algorithm to retrieve data about you.
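To give a feel for the plumbing, here’s a hedged sketch of the tablet-side call to a FaceClassify-style algorithm over Algorithmia’s HTTP API. The algorithm path, the base64 payload shape, and the `uid` response field are assumptions for illustration, not a published interface:

```python
# Sketch: POST one sampled frame to an Algorithmia-hosted algorithm.
import base64
import json
import urllib.request

API_BASE = "https://api.algorithmia.com/v1/algo"

def build_request(algo_path, image_bytes, api_key):
    """Build the POST request for a single sampled frame."""
    payload = json.dumps(
        {"image": base64.b64encode(image_bytes).decode("ascii")}
    )
    return urllib.request.Request(
        f"{API_BASE}/{algo_path}",
        data=payload.encode("utf-8"),
        headers={
            "Authorization": f"Simple {api_key}",
            "Content-Type": "application/json",
        },
    )

def classify_frame(image_bytes, api_key,
                   algo_path="ourteam/FaceClassify/0.1"):
    """Send one frame; return the matched UID, or None if unrecognized."""
    req = build_request(algo_path, image_bytes, api_key)
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Assumed response shape: {"result": {"uid": "..."}} on a match.
    return body.get("result", {}).get("uid")
```

The kiosk loop would simply call `classify_frame` on each sampled image until it gets a non-`None` UID back.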

The GetUserData algorithm grabs the user’s name and Spotify song URI they selected when they first trained. We then pass the name to the GreeterActions algorithm, which handles both our GitHub and Slack integrations.

Our Greeter Bot for Slack uses an incoming webhook from our app to send a message to the team that somebody has arrived and is checked in.
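The Slack side is the simplest piece: an incoming webhook just takes a JSON POST. A minimal sketch (the webhook URL and message wording are placeholders, not our actual config):

```python
# Sketch: announce an arrival to the team channel via a Slack
# incoming webhook.
import json
import urllib.request

def arrival_message(name):
    """JSON payload for the incoming webhook announcing an arrival."""
    return {"text": f":wave: {name} just arrived at the office and is checked in."}

def announce_arrival(webhook_url, name):
    """POST the arrival message to Slack; webhooks reply with 'ok'."""
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(arrival_message(name)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read() == b"ok"
```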

Greeter Bot welcomes user to the office via Slack

We then grab all the commits from GitHub since you were last in the office, format them, and pass them to our Slack webhook. The webhook handles both sending you a direct message with the commit summary and announcing in our team channel that you’ve arrived.
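Fetching "everything since your last check-in" maps neatly onto the GitHub REST API’s `since` parameter on the commits endpoint. A hedged sketch (the repo name is a placeholder, and pagination is omitted for brevity):

```python
# Sketch: pull commits newer than the user's last check-in time and
# format them into one line each for the Slack message.
import json
import urllib.request

def fetch_commits_since(owner, repo, since_iso, token=None):
    """List commits after `since_iso` (an ISO-8601 timestamp)."""
    url = f"https://api.github.com/repos/{owner}/{repo}/commits?since={since_iso}"
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"token {token}"
    req = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def format_commit_summary(commits):
    """One line per commit: short SHA, author, first line of the message."""
    lines = []
    for c in commits:
        sha = c["sha"][:7]
        author = c["commit"]["author"]["name"]
        subject = c["commit"]["message"].splitlines()[0]
        lines.append(f"{sha} {author}: {subject}")
    return "\n".join(lines)
```

The formatted summary is what gets handed to the Slack webhook as the direct-message body.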

User receiving all the GitHub commits they’ve missed

Integrating Spotify

We wanted the process of choosing your office walk-up music to be simple, intuitive, and fun. So we created a choose-your-own adventure flow:

The user first selects a genre of music, and then chooses “happy,” “dancing,” or “celebration” music. The user is then presented with three songs that match the genre-mood combination.

To get the music playing, the first thing we needed to do was create a service that could take requests for tracks and play them. We landed on Pi MusicBox, a free, headless audio server for the Raspberry Pi based on Mopidy.

Pi MusicBox running Spotify on the Raspberry Pi

Getting Pi MusicBox up and running was straightforward, but we realized that it didn’t have an officially documented API endpoint we could hack on – it’s intended to act more like a replacement for Sonos that lets you stream music from Spotify, Google Music, SoundCloud, Webradio, Podcasts, and more. So, we had to reverse engineer it.

The first thing we noticed was that all communication was handled over websockets, which controlled functions like play, pause, and change song. Once we figured out the pattern, it was as easy as setting up another microservice on Algorithmia to pass this information through:
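For flavor, here’s a sketch of the kind of messages involved. Mopidy (which powers Pi MusicBox) speaks JSON-RPC 2.0 over its websocket at `ws://<pi-host>:6680/mopidy/ws`; the exact method parameters can differ between Mopidy versions, so treat this as an approximation of what we reverse-engineered rather than a documented API:

```python
# Sketch: build the JSON-RPC 2.0 messages that queue and play a track
# over the Mopidy websocket.
import json

def rpc(method, request_id, **params):
    """Build one JSON-RPC 2.0 message for the Mopidy websocket."""
    msg = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params:
        msg["params"] = params
    return json.dumps(msg)

def play_track_messages(spotify_uri):
    """Clear the tracklist, queue the user's song, and start playback."""
    return [
        rpc("core.tracklist.clear", 1),
        rpc("core.tracklist.add", 2, uris=[spotify_uri]),
        rpc("core.playback.play", 3),
    ]

# Each message would then be sent over an open websocket connection,
# e.g. with the third-party `websocket-client` package:
#   ws = websocket.create_connection("ws://pi-musicbox:6680/mopidy/ws")
#   for m in play_track_messages(uri):
#       ws.send(m)
```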

This passes in the Spotify URI, connects to the Raspberry Pi, and the song plays on the office stereo.

Conclusion

We’re pleased with how quickly we could create and stack together serverless microservices to power our automated front desk A.I. We’ll be adding the CMU OpenFace library to the platform soon, which will enable all kinds of interesting use cases for app developers – including Zuckerberg’s.

Interested in building your own? Sign up here to get started with Algorithmia.

We have some cleanup to do on the code before we’re ready to share the sample app with everybody, but in the meantime, we built this hack using the following technology: