
Hacking the Google AIY Voice Kit – Part 1 – Putting it Together

Voice controlled devices have been a fascination of mine ever since watching the classic movie 2001 – A Space Odyssey. The Google AIY Voice Kit allows experimenters and makers to build a device that operates a lot like the movie’s computer HAL; it even has a glowing light that reacts to spoken commands (it does not, however, read lips – at least not yet!).

I’m excited about this wonderful technology and plan on hacking it as much as possible. But before I hack it I need to build it. Follow along and build one too!

Introduction

The Google AIY Voice Kit is an incredible piece of technology that allows you to experiment with the Google Assistant voice recognition API on a Raspberry Pi 3. Originally offered as a gift to subscribers of the Raspberry Pi’s official magazine, MagPi, the kit is now available to anyone (see details for purchasing below).

With this kit and the addition of a Raspberry Pi 3 you can construct a small cardboard cube that is capable of natural language recognition, similar to the Google Home devices. Since the kit is powered by a Raspberry Pi you can then add additional components and peripherals to design and build voice-activated products of your own.

In this article we will put together the kit and test it out. Subsequent articles in the series will show you how to interface the kit with other components and how to change the “wake word” so that your project is not limited to recognizing the “OK Google” phrase to initiate voice controlled commands.

So let’s get building!

Where to buy the Google AIY Voice Kit

The original kit was only offered as part of a subscription to MagPi but the new and improved version is available to anyone. As of this writing the Google AIY Voice Kit can be purchased from the following distributors:

Note that the kit does NOT ship with a Raspberry Pi 3 so you’ll have to pick one up if you don’t already have a spare one. The Kit requires the Pi 3 although there is limited functionality available with Raspberry Pi 2 and Raspberry Pi Zero W boards.

What’s inside the Google AIY Voice Kit?

The kit comes attractively packaged in a flat cardboard box with a white sleeve. Opening the box will reveal its contents to be:

The Voice HAT board. This is the heart of the project and will be the subject of the hacking we will do in the next article.

The Microphone board. This board has a stereo microphone array, note that the microphones are mounted so that the sound is captured from the underside of the board.

A 3-inch speaker. This is a standard speaker with connecting wires attached.

A Push Button. This is an “arcade style” push button with an integrated LED.

Enclosure and frame. These two pieces are constructed of cardboard. They are marked with fold numbers for easy assembly.

Connecting cables. Two are included, one for the microphone board and the other for the connection to the push button.

Mounting hardware. Two plastic standoffs to support the Voice HAT to the Raspberry Pi. Also a mounting nut for the push button.

What is NOT included but required are the following:

A Raspberry Pi 3. This should require no introduction!

A Micro-SD card. This holds the image for the special version of the Raspbian operating system for the Raspberry Pi. Get a good (Class 10) card with at least 8GB of space.

A Power supply. A 5-volt supply with a micro USB connector for the Raspberry Pi. You’ll need one that can supply at least 2 amperes of current.

You will also need a USB mouse and keyboard and an HDMI monitor to set things up. A set of needle-nose pliers and a Phillips #00 screwdriver are the only tools you’ll need. Some double-sided or Scotch tape will be used to hold the microphone board to the inside of the box – the instructions also suggest the use of hot glue but since I plan on disassembling this later to hack it I chose to use Scotch tape.

Once you have gathered together all of the parts it’s time to construct your AIY Voice Kit.

Putting it all together

Start by attaching the two plastic standoffs to the Raspberry Pi 3. They go onto the same side of the board as the HDMI and Power connectors, opposite the side with the 40-pin GPIO connector. I found these to be a bit hard to put in by hand and enlisted the help of my needle-nose pliers.

Next you attach the Voice HAT (Hardware Attached on Top) to the Raspberry Pi. It’s fairly obvious how this fits: the standoffs snap into the front and the 40-pin GPIO connector holds the back. Make sure that you don’t bend any of the pins while inserting the HAT onto the Pi.

Connect the speaker to the blue terminal strip on the Voice HAT. The red wire goes to the positive terminal, black to negative (although I’m sure it would work just fine if you happen to reverse them). Once the wires are in place use the Phillips #00 screwdriver to secure them.

Now we connect the push button cable to the HAT, it’s the one with 4 wires. The plastic connector goes to the Voice HAT, it is keyed so that it can only be inserted one way. Leave the other end (with the four spade terminal connectors) unconnected for now.

Grab the microphone board and the remaining 5-wire cable and mate them. As with the previous cable the plastic connector is keyed to only fit one way. Both ends of the cable are identical and it doesn’t matter which end you connect to the microphone board.

Now connect the other end of the microphone cable to the Voice HAT board.

This completes the assembly of the hardware. Now it’s time to fold some cardboard!

Origami, Google Style!

The Google AIY Voice Kit is constructed out of cardboard and consists of two parts:

An outer shell. This has cutouts for the push button and holes to allow the sound from the speaker to be emitted.

An inner frame. This holds the speaker, the Raspberry Pi and the Voice HAT.

The cardboard pieces are marked with assembly and folding steps so it would be pretty hard to get it wrong. Here is how it is assembled:

First we build the box. It’s really simple, consisting of four folds marked “Fold 1” to “Fold 4”. Make the first three folds in order.

When you get to Fold 4 insert its tab beneath Fold 1 to secure the box.

Now onto the frame. This one’s a bit more challenging but it’s still pretty straightforward. Start by folding the two flaps labeled “1” and “2” along their creases.

There is a “U-Shaped” cutout on the larger flap on the top of the frame, it is designed to hold the speaker. Push that flap out now.

Now fold the rest of the large flap out, bending along the creases. Note that there are two folds here, one at the junction of the base and the other one on the top.

Time to fit the speaker into the frame. The speaker magnet slides into the “U-shaped” cutout that you worked with in the last two steps. The speaker wires should face up when it is inserted.

And now place the Raspberry Pi and Voice HAT assembly on the bottom of the frame. Note that side flap 2 has an extension that mounts behind the USB and Ethernet connectors on the Pi.

We are ready to mate the frame to the box! Slide the frame into the box, making sure that you have the box oriented so that the speaker holes are in the right direction. It will slide into place quite easily, if you encounter more than a little resistance then stop and make sure your Raspberry Pi has not moved out of place or that a cable has not been caught between the box and the frame.

Once it is all in place press down on the Raspberry Pi so that its connectors are exposed through the cutouts in the frame and box.

Now slip the push button switch into the hole on the top of the box.

Use the large plastic nut to fasten the pushbutton to the box. You don’t need (and shouldn’t use) any tools for this, just hand tighten it.

The next step is to connect the wires on the switch cable to the push button. These press-fit onto the push button spade connectors. It is essential that you get this correct as you could potentially damage the Voice HAT if you don’t. Orient the push button so that the ”crown shape” logo is right side up and connect the wires as follows:

Blue – bottom left (this is the LED negative)

Red – bottom right (this is the LED positive)

White – top left (this is one end of the SPST switch)

Black – top right (this is the other end of the SPST switch)
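Since getting this wiring wrong could damage the Voice HAT, it can help to keep the list above handy as a small lookup table. This is only a note-taking sketch in Python: the names `BUTTON_WIRING` and `describe_wire` are made up for illustration and are not part of any AIY software.

```python
# Wire colors and push-button terminals as listed above, with the
# "crown shape" logo right side up. All names here are illustrative only.
BUTTON_WIRING = {
    "blue": ("bottom left", "LED negative"),
    "red": ("bottom right", "LED positive"),
    "white": ("top left", "one end of the SPST switch"),
    "black": ("top right", "other end of the SPST switch"),
}


def describe_wire(color):
    """Return a one-line reminder of where a given wire goes."""
    position, function = BUTTON_WIRING[color.lower()]
    return f"{color.capitalize()} wire -> {position} terminal ({function})"


if __name__ == "__main__":
    for color in BUTTON_WIRING:
        print(describe_wire(color))
```

Running it prints one reminder line per wire, which you can check off as you press each connector into place.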

We are ready to attach the microphone board. I held mine in place with a couple of pieces of Scotch tape but double-sided tape or hot glue would work too. Keep in mind that you may want to pull this apart later, for that reason I chose to use tape. No matter what you choose make sure that the two microphone elements are aligned with the small holes in the box or your Voice Kit will not be able to hear you!

Make one last check to see that everything is held in place and that your connectors are still attached. Then fold the top of the box.

We are done! You have now assembled the Google AIY Voice Kit and you are the proud owner of a cardboard box that you can talk to. Perfect for those lonely evenings in the workshop!

Of course that just completes the hardware assembly, your box needs an operating system and some software to bring it to life. Let’s take care of that now.

Software for the Google AIY Voice Kit

In order to get your Voice Kit up and running you will need to do a few things:

Install an Operating system

Setup the Google Assistant API

Run a Python script

We will now examine how you accomplish those three tasks, starting with the operating system.

Getting the Raspbian Operating System

Although the Voice HAT is capable of being used with many different operating systems the best choice for the Google AIY Voice Kit is a custom build of the Raspbian operating system, a LINUX-based OS that is commonly used with the Raspberry Pi.

To make your life easier Google has provided an image of this custom OS that you can download and install onto your Micro-SD card. So the first step you’ll need to take is to get the latest build of the Voice Kit Image using your computer – LINUX, Windows or Mac. If you are on a slow Internet connection be patient as the image weighs in at almost 1 gigabyte.
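To get a feel for how patient you’ll need to be, here is a rough back-of-the-envelope estimate in Python. It assumes the image is a full 1 gigabyte (decimal) and ignores protocol overhead; the function name and the example connection speeds are just illustrations.

```python
def download_minutes(size_gigabytes, speed_megabits_per_second):
    """Estimate download time in minutes, ignoring protocol overhead."""
    size_megabits = size_gigabytes * 1000 * 8  # 1 GB = 1000 MB = 8000 Mbit
    seconds = size_megabits / speed_megabits_per_second
    return seconds / 60


if __name__ == "__main__":
    for speed in (5, 25, 100):  # example connection speeds in Mbit/s
        minutes = download_minutes(1, speed)
        print(f"{speed:>3} Mbit/s: about {minutes:.0f} minutes")
```

On a 5 Mbit/s line that works out to a little under half an hour, so it may be worth starting the download before you begin folding cardboard.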

Once you have your image you’ll need to write it to the Micro-SD card. There are many ways of doing this, one of the best (and the one recommended by Google) is to use a free open source product called Etcher.

Etcher is available for LINUX, Windows or Mac and can be downloaded from Etcher.io. Go grab a copy and install it onto a computer that has an SD card slot (i.e. just about every notebook computer made today).

If your computer lacks an SD card slot you can get an inexpensive adapter that connects to your USB port. With most computers and many adapters you’ll also need an adapter that converts a Micro-SD card to a full-sized SD card, most Micro-SD cards come with one.

With Etcher running and your blank Micro-SD installed you are ready to transfer your image file. Using Etcher couldn’t be simpler. First you select the image you downloaded. Next you select the Micro-SD card – chances are that Etcher has already selected it for you but it is super important to verify that it got it right to avoid wiping the wrong external storage device!

When all has been selected you are ready to burn the card. Click Etcher’s “Flash” button. On LINUX systems you’ll be prompted for an administrator password, on Windows and OSX you’ll go right to the flashing process. Etcher will display a progress bar while it flashes and then verifies the data.

Once it’s done Etcher will display a status message. You can then eject and remove the Micro-SD card and insert it into the Raspberry Pi 3 card slot, which is exposed through one of the cardboard cutouts on the AIY Voice Kit.

While you are at it hook up a keyboard and mouse to the Raspberry Pi’s USB ports. Attach an HDMI monitor to it as well. Finally attach the 5-volt power supply to the device.

If everything is working correctly you should see a power light on the Raspberry Pi 3 illuminate, it’s located very close to the Micro-SD card slot and the power connector. The HDMI monitor will display the boot process, the first time you boot you’ll get a quick message about the volume resizing itself followed by a reboot.

Once the boot process is complete you’ll see the custom Raspbian desktop with the “AIY cardboard” background. Verify that your mouse and keyboard are working.

If all is well then it’s time to get “on the air”!

Setting up your WiFi connection

If you don’t have a WiFi system available you could use an Ethernet cable to attach your Google AIY Voice Kit to the Internet. However most people will want to use WiFi for its versatility.

The Raspbian operating system desktop has an icon on the top right corner for WiFi, it’s between the Bluetooth and Speaker icons. When you first start it will have a couple of red crosses indicating that there is no WiFi Connection.

Click on the icon to display the list of available WiFi networks. If you don’t see any be patient and try again in about a minute.

Once you find your desired WiFi network click on it to connect. You’ll get a text box to enter your WiFi access code, keep in mind that this will be displayed in clear text while you enter it. Assuming you entered your code correctly you’ll be connected and the WiFi icon will now display the signal strength.

To verify that you’ve connected correctly Google has provided a script to test your connection, you’ll find it right on your desktop. Click the “Check WiFi” icon and a terminal box will open and run a test connection to the Google servers. If everything is alright it will display the message “The WiFi Connection seems to be working”. You can close the box by hitting Enter on your keyboard.
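If you are curious what a check like this might look like, here is a minimal Python sketch that simply tries to open a TCP connection to a Google server. This is not the script Google ships on the desktop; the function name `can_reach`, the host, the port and the timeout are all my own assumptions for illustration.

```python
import socket


def can_reach(host="www.google.com", port=443, timeout=5.0,
              connector=socket.create_connection):
    """Return True if a TCP connection to host:port opens within timeout.

    `connector` is injectable so the function can be exercised without
    a network; by default it is socket.create_connection.
    """
    try:
        connection = connector((host, port), timeout)
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False
    connection.close()
    return True


if __name__ == "__main__":
    if can_reach():
        print("The connection seems to be working.")
    else:
        print("Could not reach the server; check your WiFi settings.")
```

A successful TCP connection only shows that the network path is open, which is usually enough to tell a WiFi problem apart from a problem further along in the setup.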

Testing the Audio

There is another script that you can run now to make sure that your Voice HAT is functioning properly.

On the Raspbian desktop you’ll find another icon labeled “Check audio”. Click it and a terminal window will open and will display the text “playing a test sound…”. If all is well you will hear a female voice saying “front center” playing through the speaker. You can then answer “yes” (“y”) to the question asking if you heard the sound.

You are then prompted to hit Enter and say “Testing 1-2-3” (you could of course say anything you want to). After speaking the test phrase your voice will be played back to you. Again you need to answer the question regarding hearing your own voice by pressing the letter “y”.

Finally you just hit the Enter key to close the script. Your Voice HAT appears to be functional.
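The record-and-play-back part of that test can be approximated with the standard ALSA command line tools `arecord` and `aplay`. The sketch below is my own, not Google’s “Check audio” script: the function names, the temporary file path and the 3-second duration are assumptions, and it relies on the default ALSA capture and playback devices being the Voice HAT.

```python
import shutil
import subprocess


def record_command(path, seconds=3):
    """Build an arecord command that captures `seconds` of CD-quality audio."""
    return ["arecord", "-q", "-d", str(seconds), "-f", "cd", "-t", "wav", path]


def play_command(path):
    """Build an aplay command that plays back a WAV file."""
    return ["aplay", "-q", path]


def record_and_play(path="/tmp/mic_test.wav", seconds=3):
    """Record from the default capture device, then play the result back."""
    subprocess.run(record_command(path, seconds), check=True)
    subprocess.run(play_command(path), check=True)


if __name__ == "__main__":
    if shutil.which("arecord") and shutil.which("aplay"):
        print("Say 'Testing 1-2-3' now...")
        try:
            record_and_play()
        except subprocess.CalledProcessError as error:
            print(f"Audio test failed: {error}")
    else:
        print("arecord/aplay not found; run this on the Raspberry Pi.")
```

If you hear your own voice played back, both the microphone board and the speaker are working.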

Now you’re ready for the next step, setting up the Google Assistant API.

Setting up the Google Assistant API

Google Assistant is Google’s fusion of Machine Learning, Speech Recognition and Language Understanding. Essentially it allows you to have a verbal conversation with Google and receive voice answers. The most well known use of Assistant at the moment is on the Google Home devices but it is also available for selected Android phones. Eventually it will be available almost everywhere.

The Google Assistant API allows developers to leverage the amazing capabilities of Google Assistant for their own products. Think voice controlled robots and IoT devices.

We will be making use of the Google Assistant API in the Google AIY Voice Kit. In order to do so we will need to do the following:

Login to the Google Cloud Console

Create a Project that uses the Google Assistant API

Download a JSON file that contains the connection information our application will use to interact with the Google Assistant API

Let’s go through those steps in detail. You’re going to need a Google account, which for most people will be a GMail address, so if you don’t have one go out and get one (it’s free). You will also need to use that account to sign up for Google Cloud Services.

Note that while Cloud Services are free to sign up for some activities will incur billing. Nothing we are doing with our setup of the Google AIY Voice Kit will cost anything, however some of the experiments we will perform in the subsequent articles do have a cost to them. There is an amount that is allocated for free (per month) so you can do everything without charge, if you do get charged Google will inform you.

Again, nothing we are about to do will cost anything. So get your GMail account ready and perform the following actions on your Raspberry Pi using the Chromium Web Browser included in Raspbian:

Enter a name for your product (e.g. “My Voice Demo”) in the “Product Name Shown to Users” box.

Click the “Save” button

You will be returned to the “Create Client ID” screen. There will be a set of radio buttons to choose Application Type.

Select the “Other” radio button. A text box will appear to enter a name

Enter any name you wish (e.g. “AIY Voice Kit”).

Click Create

A dialog box will appear with your OAuth credentials. You do not need to record these

Click OK to close the OAuth Credentials dialog box

The OAuth 2.0 Client IDs will be listed on the Credentials screen

Click the Download arrow to download a JSON file. You will see the download progress on the bottom of the Chromium browser (in the browser’s taskbar)

When the download has finished click the arrow on the download indicator and select “Open Location”. This will display the file manager with the JSON file you just downloaded in the “Downloads” folder. Minimize the web browser but do not close it.

Login with the same Google (i.e. GMail address) account you used for the Cloud Platform dashboard

Scroll down the screen and activate the following permissions (some may already be activated) using the “slide switch”. Each time you set a switch “on” you will get a confirmation dialog that you need to answer yes to:

Web and App Activity

Device Information

Voice and Audio activity

You are now ready to work with the Google Assistant API. You may close the web browser and file manager now.

So now we have a project on the Google Cloud that contains the Google Assistant API and is associated with our Google AIY Voice Kit. The JSON file we downloaded has all the connection information we need to “glue” our Voice Kit to the project.
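If you want to peek inside that JSON file, a few lines of Python can confirm it looks sane before you move on. This is a hedged sketch: the top-level `installed` key and the field names below reflect my understanding of how OAuth client files for the “Other” application type are laid out, so treat them as assumptions. The function name `missing_fields` and the sample values are purely illustrative.

```python
import json

# Fields an OAuth client file of type "Other" is assumed to contain.
EXPECTED_FIELDS = ("client_id", "client_secret", "auth_uri", "token_uri")


def missing_fields(text):
    """Return the expected fields absent from the JSON text of a client file."""
    data = json.loads(text)
    client = data.get("installed", {})
    return [name for name in EXPECTED_FIELDS if name not in client]


if __name__ == "__main__":
    # Placeholder values only; a real file holds your own credentials.
    sample = json.dumps({"installed": {
        "client_id": "example-id",
        "client_secret": "example-secret",
        "auth_uri": "https://example.invalid/auth",
        "token_uri": "https://example.invalid/token",
    }})
    print(missing_fields(sample) or "all expected fields present")
```

To check your own download you could read the file’s contents and pass them to `missing_fields`; just remember that the client secret in there should not be shared.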

We are ready to start talking to Google!

Demo Programs

To get you started Google has provided three Python demo programs that can test out your AIY Voice Kit. We will use two of them today, in the next article (and associated video) we will use the remaining one.

Assistant Library Demo

The first program we’ll be running is the Assistant Library Demo. This program essentially turns your Google AIY Voice Kit into a cardboard version of Google Home.

To begin click on the “Start dev terminal” icon on your Raspbian desktop. A terminal window will open up, already set to the correct drive path.

At the command prompt in the terminal window type the following: src/assistant_library_demo.py

When you first run this the Chromium web browser will open up and a screen will display asking which Google account you want to use. Select the same account that you used to setup the project and Google Assistant API in the Google Cloud.

The screen will display a permissions screen which outlines the authority that you’re giving to the application. Approve it and then close the browser.

Once this is done the Assistant Library Demo will start running. If you look at the push button on the AIY Voice Kit you’ll see the LED pulse every few seconds. This indicates that the script is running and that everything is working.

Now talk to the box! Start your query with “OK Google” or “Hey Google”, followed by your question. This is the “wake word” and you can confirm that the AIY Voice Kit is acknowledging it by observing the LED in the push button – it will illuminate when it recognizes the wake word.

The box should now speak the answers to your queries, just like a Google Home device!
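The behavior described above (LED pulsing while waiting, lighting up on the wake word, then returning to waiting after the answer) can be pictured as a tiny state machine. The Python below is only a model of what I observed, not the demo’s actual code: the state names, event names and the LED behavior while answering are my own assumptions.

```python
# States and LED behavior described in the article, as a tiny state machine.
LED_FOR_STATE = {
    "ready": "pulsing",   # script running, waiting for the wake word
    "listening": "on",    # wake word recognized, taking the question
    "answering": "on",    # speaking the reply (LED behavior assumed)
}

TRANSITIONS = {
    ("ready", "wake_word"): "listening",
    ("listening", "question_done"): "answering",
    ("answering", "answer_done"): "ready",
}


def next_state(state, event):
    """Return the state after `event`; unknown events leave it unchanged."""
    return TRANSITIONS.get((state, event), state)


if __name__ == "__main__":
    state = "ready"
    for event in ("wake_word", "question_done", "answer_done"):
        state = next_state(state, event)
        print(f"{event:>13} -> {state:<9} (LED {LED_FOR_STATE[state]})")
```

Thinking of the box this way will come in handy later in the series, when we start attaching our own actions to those events.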

You can keep talking to the box until you grow tired of it. When you finish you can press “Ctrl-C” on your keyboard to stop the program.

Leave the terminal window open so you can run the next demonstration.

GRPC Demo

The next program uses gRPC, the remote procedure call framework developed by Google. In this script you’ll use the push button instead of the “wake word” to initiate your query.

In the terminal window type the following: src/assistant_grpc_demo.py

As you have given permission in the last script you won’t be prompted again. As with the last script you’ll observe the LED in the push button glowing every few seconds.

To make a query press the push button. The LED will glow constantly and you can now ask your question, without the “OK Google” or “Hey Google” wake word.

Again you can press “Ctrl-C” on your keyboard to stop the program.

What’s next with the Google AIY Voice Kit?

Obviously there is a ton of potential with this device. The power and versatility of the Raspberry Pi combined with the Voice HAT is truly mind boggling.

In the next article and video in this series I will show you how to attach devices to the Raspberry Pi and Voice HAT and write programs to control them with your voice. To make the best use of the hardware I’ll be pulling it out of its cardboard box and mounting it on an experimenter’s breadboard.

You probably can think of a few projects already that would benefit from voice control and voice recognition. Keep those thoughts in mind and I’ll see you in the next article.