Announcing Vox

Vox is a login-free voice and text chat platform. I wrote and published it months ago, but haven't had a soapbox on which to talk about it until recently.

How to use it

Vox is a low-friction way to talk to a group of people. There are no installs, no sign-ups, and no prompts except for the browser's request for microphone access.

The basic concept is similar to IRC. If you visit http://vox.r6l7.com/r/demo1, you'll be brought to the "demo1" room. Anyone else who visits that URL will be brought to the same room. This makes it possible to set up a common meeting place based on a topic or shared interest.

If you'd prefer to avoid random strangers, you can pick a room name at random. Or you can visit http://vox.r6l7.com/ directly without specifying a room name, which will redirect you to a new, randomly generated room. Then you can share the link, by whatever other means you like, with whoever you want to talk to.
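As a rough illustration of the room scheme described above, here is a minimal TypeScript sketch. It is not Vox's actual code. Only the /r/<name> path shape comes from the URLs above; the function names, the allowed characters, and the 12-character default length are assumptions made for the example.

```typescript
// Hypothetical helpers. Only the "/r/<name>" path shape is taken from the
// post; the alphabet, name length, and function names are assumptions.
const ROOM_ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789";

// Returns the room name for a path such as "/r/demo1", or null if the path
// does not name a room.
function roomFromPath(path: string): string | null {
  const match = /^\/r\/([A-Za-z0-9_-]+)\/?$/.exec(path);
  return match?.[1] ?? null;
}

// Builds a name of `size` characters drawn from ROOM_ALPHABET.
// `pick(bound)` must return an integer in [0, bound). The Math.random default
// is for illustration only; a name that is meant to be hard to guess should
// come from a cryptographically secure source instead.
function randomRoomName(
  size: number = 12,
  pick: (bound: number) => number = (bound) => Math.floor(Math.random() * bound),
): string {
  let roomName = "";
  for (let i = 0; i < size; i++) {
    roomName += ROOM_ALPHABET.charAt(pick(ROOM_ALPHABET.length));
  }
  return roomName;
}

// For a visit to "/", returns the path of a fresh random room to redirect to.
// Any other path needs no redirect, so the result is null.
function redirectPathFor(path: string): string | null {
  return path === "/" ? `/r/${randomRoomName()}` : null;
}
```

With these helpers, roomFromPath("/r/demo1") yields "demo1", and redirectPathFor("/") yields a path of the form /r/ followed by twelve random characters.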

This makes it easy to upgrade an existing text conversation into a voice conversation. All you have to do is visit the site, then share the link with someone who has a browser, and you're good to go. Modern mobile browsers should work just as well as desktop ones.

In short: No more exchanging Skype IDs!

Underlying Technologies

The main technical ingredient is WebRTC. There's really no other way I could have written it, so keep that in mind as I rant about the drawbacks.

Support is mixed. Apparently it doesn't work on Safari or IE. I didn't bother to test those browsers since I don't use Windows or iOS, but I acknowledge a lot of others are still stuck on those platforms.

Even where it is supported, there are issues. I spent hours trying to implement a volume slider, but kept getting only silence in Chrome. Eventually I learned that combining WebRTC with WebAudio is currently not supported on Chrome. That's bug 121673. Expect to see a volume slider added some time after that bug is fixed.

All in all, it's very much a web technology. It's easy to build and use, except for the part where the browser vendors decided to each go their own way.

Limitations

The audio traffic is peer-to-peer. In theory, this makes the server very scalable: all it has to do is pass around a few control-plane messages and handle the text chat traffic. Clients, on the other hand, would probably have a bad time if the room had any more than a few people.
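To put rough numbers on the client-side cost, here is a small sketch. It assumes a full mesh, meaning every client sends its own audio separately to every other client in the room, which is the usual shape of peer-to-peer audio without a relay. The 32 kbps per-stream figure is a made-up placeholder, not a measurement from Vox.

```typescript
// Back-of-the-envelope load for a full-mesh audio room.
// Assumptions: each participant keeps one connection to every other
// participant, and each outgoing stream costs `kbpsPerStream` (hypothetical).
interface MeshLoad {
  connectionsPerClient: number; // peer connections each browser maintains
  totalConnections: number;     // distinct peer pairs in the room
  uplinkKbpsPerClient: number;  // audio each browser has to upload
}

function meshLoad(participants: number, kbpsPerStream: number = 32): MeshLoad {
  if (!Number.isInteger(participants) || participants < 1) {
    throw new RangeError("participants must be a positive integer");
  }
  const peers = participants - 1;
  return {
    connectionsPerClient: peers,
    totalConnections: (participants * peers) / 2,
    uplinkKbpsPerClient: peers * kbpsPerStream,
  };
}
```

Under those assumptions, a room of 10 means 9 connections and 288 kbps of upload per client, and 45 connections in total. The per-client cost grows linearly with room size while the total grows quadratically.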

In practice, scaling on the server side is pretty bad, too. I've been mainly focused on basic functionality, so I haven't gotten around to implementing twelve-factor principles. The room state really should be stored in Redis or something equivalent. For now, the assumption of "one service, one process, one machine" is baked into the code.
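One way to loosen that assumption is to put the room state behind a small interface, so the in-process version can later be swapped for a shared store. The sketch below is hypothetical and not taken from the Vox source: the RoomStore interface, its method names, and the idea that room state is a set of client IDs per room are all assumptions. The methods return promises because a networked store such as Redis would be asynchronous.

```typescript
// Hypothetical abstraction over room membership. None of these names come
// from the Vox codebase.
interface RoomStore {
  // Adds a client to a room and returns the IDs of the clients already there.
  join(room: string, clientId: string): Promise<string[]>;
  // Removes a client from a room.
  leave(room: string, clientId: string): Promise<void>;
  // Lists the clients currently in a room.
  members(room: string): Promise<string[]>;
}

// Single-process implementation, equivalent to keeping state in memory.
class InMemoryRoomStore implements RoomStore {
  private readonly rooms = new Map<string, Set<string>>();

  async join(room: string, clientId: string): Promise<string[]> {
    const current = this.rooms.get(room) ?? new Set<string>();
    const others = Array.from(current).filter((id) => id !== clientId);
    current.add(clientId);
    this.rooms.set(room, current);
    return others;
  }

  async leave(room: string, clientId: string): Promise<void> {
    const current = this.rooms.get(room);
    if (current === undefined) return;
    current.delete(clientId);
    // Drop empty rooms so the map does not grow without bound.
    if (current.size === 0) this.rooms.delete(room);
  }

  async members(room: string): Promise<string[]> {
    return Array.from(this.rooms.get(room) ?? []);
  }
}
```

A Redis-backed class could implement the same interface, for example by keeping one Redis set per room, and the rest of the server would not need to know which one it was talking to.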

See Also

If you like the concept but don't like my implementation of it, check out Talky.io. I didn't know about them when I started the project, but it seems they've built something quite similar, and did it first. They're also much better at web design.