Victor - Part 2

Jan 3, 2017 - 12:52 am

But of course, that isn't the end. Not even close. At the beginning of fall term of junior year, my computer was once again right next to the TV, and my levels of laziness had reached an all-time high. I once again started to think of ways that I could control Spotify, Netflix, and my movies from the comfort of the futon (note: it was approximately three feet away from the computer). I pulled out the Kinect again, but remembered the problems that it had last time. I couldn't talk to it if I wasn't there, the speech recognition API didn't allow for a lot of commands, and it relied on something that was, frankly, poorly supported and had next to no online documentation. So it was time for a change.

So I needed a new input device. What do I practically always have with me? My phone. AND it's jailbroken. I did some searching around and found Assistant+, a great tweak for only $2 that lets you create custom Siri commands with capture groups (stuff like "Look up [movie]") that work with Activator commands. By combining a few shell scripts that can handle some parameters with Activator's "Run Command" feature, I could run practically any code through a voice command to Siri. But I needed it to be able to affect my computer. Time for my favorite/least favorite part of coding: glue.
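To make the capture group idea concrete, here is a minimal Python sketch of how a template like "Look up [movie]" could be turned into a pattern that pulls the movie name out of a spoken phrase. This is not how Assistant+ works internally (that isn't documented here); the function names `compile_template` and `match_command` are made up for illustration.

```python
import re

def compile_template(template):
    """Turn 'Look up [movie]' into a regex with a named group 'movie'.

    Hypothetical helper: only illustrates the capture group concept.
    """
    # re.split with a capturing group keeps the placeholder names:
    # 'Look up [movie]' -> ['Look up ', 'movie', '']
    parts = re.split(r"\[(\w+)\]", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)       # literal text
        else:
            pattern += f"(?P<{part}>.+)"     # placeholder -> named group
    return re.compile(f"^{pattern}$", re.IGNORECASE)

def match_command(template, spoken):
    """Return a dict of captured values, or None if the phrase doesn't fit."""
    match = compile_template(template).match(spoken.strip())
    return match.groupdict() if match else None
```

Calling `match_command("Look up [movie]", "look up The Matrix")` gives back a dictionary with the movie title, which is the piece a shell script would then receive as a parameter.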

I set up a MySQL database and a PHP script on my website that could take a POST request and create a 'task' in a table. This let me log every command I make, do optional pre-processing on the task requests, and see what level of completion each command is at, all in one place. This was a lot of foresight for no immediate functionality, but designing it this way was a godsend later.
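The task table idea can be sketched in a few lines. The real setup was MySQL behind a PHP script; the Python below swaps in an in-memory SQLite database so it runs anywhere, and the column names (`command`, `argument`, `status`) and function names are assumptions for illustration, not the actual schema.

```python
import sqlite3

def open_queue():
    """Create an in-memory task table (stand-in for the MySQL table)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE tasks (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               command TEXT NOT NULL,
               argument TEXT,
               status TEXT NOT NULL DEFAULT 'pending'
           )"""
    )
    return conn

def create_task(conn, command, argument=None):
    """What the POST handler would do: record a new pending task."""
    cursor = conn.execute(
        "INSERT INTO tasks (command, argument) VALUES (?, ?)",
        (command, argument),
    )
    conn.commit()
    return cursor.lastrowid

def pending_tasks(conn):
    """What the desktop client would ask for: uncompleted tasks, oldest first."""
    rows = conn.execute(
        "SELECT id, command, argument FROM tasks "
        "WHERE status = 'pending' ORDER BY id"
    )
    return rows.fetchall()

def mark_done(conn, task_id):
    """Record that a task finished so it is not picked up again."""
    conn.execute("UPDATE tasks SET status = 'done' WHERE id = ?", (task_id,))
    conn.commit()
```

Keeping a `status` column rather than deleting rows is what preserves the history of every command in one place.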

I opened up the C# code that I had originally for Victor and stripped out all of the Kinect APIs. Instead, I put in a ping to the server to check for any uncompleted tasks. Then, I began to figure out what I wanted it to do, and how to use C# to interact with my computer and the surrounding systems. First I wanted to control Spotify, as unfortunately Siri can't do that. Luckily, someone has written a C# API to talk to the underlying Spotify web player, so it was pretty straightforward to code in pause, play, next, and back controls through voice. To play playlists/artists/albums with a voice command, I intercepted the task request in the PHP script, used Spotify's search API, and then added the result to the database as a Spotify URI (the spotify:<type>:<id> format) for easy use.
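The "ping the server for uncompleted tasks" loop boils down to fetch, dispatch, mark done. The original client was C#; this is a rough Python sketch of the same shape, where `fetch_pending`, `handlers`, and `mark_done` are hypothetical callables supplied by the caller (for example, an HTTP request to the PHP script and a function that pauses Spotify).

```python
import time

def process_once(fetch_pending, handlers, mark_done):
    """Run each pending task through its handler; return the ids completed.

    fetch_pending() yields (task_id, command, argument) tuples.
    handlers maps a command name to a function taking the argument.
    """
    completed = []
    for task_id, command, argument in fetch_pending():
        handler = handlers.get(command)
        if handler is None:
            continue  # unknown command: leave it pending for later inspection
        handler(argument)
        mark_done(task_id)
        completed.append(task_id)
    return completed

def poll_forever(fetch_pending, handlers, mark_done, interval_seconds=2.0):
    """Keep checking for new tasks; the interval is an arbitrary choice."""
    while True:
        process_once(fetch_pending, handlers, mark_done)
        time.sleep(interval_seconds)
```

Routing on a command name through a dictionary means adding a new voice command is just one more entry in `handlers`.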

Then I wanted to work with Netflix and my local movie collection. The local collection was relatively easy, as it was already incredibly well formatted and organized for my movie displaying code, so I could easily search titles with C#'s directory-listing ('dir') capabilities. For Netflix, I used Selenium and Chrome to navigate to Netflix, log in, search for any string, play, pause, and quit. At this point, I had some pretty good control over my computer with my phone, and it was the end of the fall term: time for winter break.
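Searching a well-organized movie folder by title is simple enough to sketch. The original did this in C#; the Python version below is an approximation that assumes the movie title is in the file name, and the extension list and the `find_movies` name are assumptions, not details from the actual code.

```python
from pathlib import Path

# Assumed set of video file types; adjust to match the real collection.
VIDEO_EXTENSIONS = {".mp4", ".mkv", ".avi"}

def find_movies(library_dir, query):
    """Return video files whose name contains every word of the query."""
    wanted = query.casefold().split()
    matches = []
    for path in sorted(Path(library_dir).rglob("*")):
        if path.is_file() and path.suffix.lower() in VIDEO_EXTENSIONS:
            title = path.stem.casefold()
            if all(word in title for word in wanted):
                matches.append(path)
    return matches
```

Matching word by word, ignoring case, is forgiving of how speech recognition transcribes a title, though it will not handle misspellings.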