A Year Building a WebRTC Application: Lessons in Startup Engineering

I’ve been an Engineer at Toptal for just about a year now, working on the same WebRTC framework project since I joined the network: Ondello, a service that connects doctors and patients over WebRTC. Think Google+ Hangouts for healthcare.

(WebRTC is a technology that allows for real-time communication through a web browser. The best example, again, is Google+ Hangouts: without any external applet, you get real-time video chat. There’s a good write-up here.)

With this post, I’d like to share the story behind Ondello. Specifically, I’d like to talk about:

How cutting-edge technologies (e.g., WebRTC) are uniquely difficult…

How a simple WebRTC Rails application became not-so-simple, and…

How a hired Ruby developer ended up writing almost no Ruby code.

Cutting-Edge Technologies, and the Problems They Pose

As I mentioned earlier, Ondello is all about the WebRTC platform. But only Chrome, Firefox, and Opera currently support WebRTC calls without the use of a plugin. Our aim was to support all browsers (Safari and Internet Explorer are the obvious omissions from the list above) and, as such, we had to use a plugin that was itself relatively young.

When I started working with the plugin, things went smoothly, thanks to its very good documentation and the official WebRTC tutorials and guides. But as our product expanded and our requirements grew in complexity, things took a turn for the worse. This wasn’t as simple as troubleshooting Rails issues.

Lack of a WebRTC Support Community

As WebRTC is a somewhat recent technology and this plugin in particular is quite new (still invite-only with a reasonably large usage fee), neither has attracted many developers, let alone WebRTC Rails developers who write guides and tutorials for people like me. As a result, the developer community around the plugin is essentially non-existent.

I’d never considered this to be a major problem—until I missed it. For most app platforms and technologies, the general cases are easy to handle, but making things work in a very specific way under a very specific deadline—that requires intimate knowledge of the technology or access to fellow developers.

Imagine programming in a world without StackOverflow—how badly would your productivity be hampered?

Of course, we had some access to the plugin’s devs, who were as helpful as I could expect. This alleviated some of the issues, but it wasn’t the same as simply Googling “WebRTC plugin foo not working due to bar”.

Who’s at Fault?

Every piece of software out there has bugs.

But in the case of these new WebRTC technologies, it was difficult to know who was the father and who was the child. In other words: when we found a failed test case or apparent bug, it was unclear if we were at fault, or if there was a bug in the codebase we were leveraging.

What could be easily searchable on Google for most technologies became an enormous email chain followed by code reviews and, in some cases, a conference with the WebRTC developers.

On that matter, I’d say the bugs we found split roughly 50-50 between the plugin’s code and our own.

Frequent Updates to WebRTC Platform

Another consequence of this WebRTC plugin’s relative infancy is the update rate. Younger software is updated more frequently, and these updates are often breaking. With the plugin, we couldn’t continue to use older versions (since it is a communications technology, all users had to run the same version to talk to one another), so every release required refactoring.

These updates were painful. Often, we’d have stable code that had to be completely rewritten to sync up with the latest changes. In some cases, tests would fail without explanation—and if there’s no community, where do you turn for help?

Issues With WebRTC Mobile Support

As our product grew, most of our clients started to ask about mobile support.

If the WebRTC JavaScript library was difficult, then the Android library was doubly so. It took me two days just to compile the sample application, and more than five hours just to understand how it worked. The whole process gave the word ‘painful’ new meaning.

The WebRTC iOS library was easier, and the dev team had clearly spent more time on it. We were able to put it to work in just a couple of hours.

WebRTC Meets JavaScript

Moving beyond the WebRTC plugin, our JavaScript codebase taught me much about complexity. As most WebRTC frameworks are based on JavaScript for configuration (e.g., Plivo’s web framework for VoIP), it was the only language I used for my first three months at Ondello.

And as our codebase grew, it naturally became less and less manageable. We had deeply nested callbacks, poor modularization, and more—this was a startup’s first take at a WebRTC JavaScript application and, as you might expect, it was a little messy.

Eventually, we transitioned to CoffeeScript. This reduced the size of our codebase and increased its readability significantly, but wasn’t enough to tackle our true complexity problems.

PubSub

Then, we found PubSub, a wonderful publish/subscribe library for JavaScript that I can’t recommend strongly enough. This little guy helped us decouple a significant portion of our JavaScript logic; our refactoring process took a turn for the better with its discovery.

Why was PubSub so useful? WebRTC is inherently related to media and connections, and is thus full of events. With PubSub, we could easily watch for certain events at a high level of abstraction. It was a wonderful solution, and a great example of not solving the solved problem: when you have well-written, well-documented Open Source (or even proprietary) solutions out there that will simplify your codebase and speed up development, use them.

To be more specific: without PubSub, every time a user’s camera went off, we had to update three separate checkboxes that represent the camera status, each one by hand.

With PubSub, I could just trigger the “camera_went_off” event and have all three checkboxes listen for it. In addition, if someone clicked on one of these checkboxes, it could also publish this event to notify the other checkboxes to update themselves.
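The checkbox scenario above can be sketched roughly as follows. This is a minimal, hand-rolled publish/subscribe hub written for illustration only; it is not the API of the PubSub library we used, and the names EventHub and CameraCheckbox are made up for this example. Only the “camera_went_off” event name comes from our real code.

```typescript
// Illustrative publish/subscribe hub; NOT the API of the PubSub library itself.
type Handler = (payload?: unknown) => void;

class EventHub {
  private handlers: { [topic: string]: Handler[] } = {};

  // Registers a handler for a topic.
  subscribe(topic: string, handler: Handler): void {
    const list = this.handlers[topic] || [];
    list.push(handler);
    this.handlers[topic] = list;
  }

  // Calls every handler registered for the topic.
  publish(topic: string, payload?: unknown): void {
    const list = this.handlers[topic] || [];
    list.slice().forEach((handler) => handler(payload));
  }
}

// Hypothetical stand-in for one of the three camera checkboxes (no DOM needed).
class CameraCheckbox {
  checked = true;
  private hub: EventHub;

  constructor(hub: EventHub) {
    this.hub = hub;
    // Each checkbox updates itself whenever the event fires.
    hub.subscribe("camera_went_off", () => {
      this.checked = false;
    });
  }

  // Simulates the user unticking this checkbox.
  click(): void {
    this.hub.publish("camera_went_off");
  }
}

const hub = new EventHub();
const checkboxes = [
  new CameraCheckbox(hub),
  new CameraCheckbox(hub),
  new CameraCheckbox(hub),
];

// Unticking one checkbox updates all three, including the one that was clicked.
checkboxes[0].click();
console.log(checkboxes.map((box) => box.checked));
```

The point of the design is that no checkbox knows about the other two: each one only knows the event name, so adding a fourth indicator means adding one more subscriber and nothing else.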

Creating a Single-Page WebRTC Application with AngularJS

As mentioned previously, we were using a WebRTC plugin to support a wider range of browsers. But this plugin has to load whenever a user loads a new page, which can have terrible implications for performance and fluidity.

The only answer was to load the plugin as soon as the user entered the site, keep it loaded, and ensure that the user never reloaded or navigated to a different web page. This sounds impossible. But in fact, it can be accomplished with a single-page application (SPA).

With Angular, we transitioned to an SPA, which in turn allowed us to load the plugin once (when the user entered the system) and only once. From then on, any call made by the user would start almost instantly, as the plugin was already loaded. This took a lot of work, but it paid off: the performance gains were immediate.
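The load-once idea can be sketched like this. It is a simplified, synchronous illustration rather than our actual Angular code: loadPlugin, CallPlugin, and PluginService are hypothetical names, and a real plugin would load asynchronously. What matters is that one long-lived object caches the plugin, which is only possible when the page itself never reloads.

```typescript
// Hypothetical shape of the loaded plugin.
interface CallPlugin {
  startCall(peerId: string): string;
}

// Counts how often the (expensive) load actually happens.
let loadCount = 0;

// Hypothetical stand-in for whatever call really initializes the plugin.
function loadPlugin(): CallPlugin {
  loadCount += 1;
  return {
    startCall: (peerId: string) => "calling " + peerId,
  };
}

// Long-lived service: loads the plugin on first use and caches it afterwards.
class PluginService {
  private plugin: CallPlugin | null = null;

  get(): CallPlugin {
    if (this.plugin === null) {
      this.plugin = loadPlugin();
    }
    return this.plugin;
  }
}

// One shared instance for the lifetime of the page, as an SPA allows.
const pluginService = new PluginService();

// Simulated "views": moving between them no longer reloads anything.
function dashboardView(): CallPlugin {
  return pluginService.get();
}

function callView(peerId: string): string {
  return pluginService.get().startCall(peerId);
}

dashboardView();
console.log(callView("doctor-42"));
console.log("plugin loads:", loadCount);
```

In a multi-page application, every navigation would throw this cached instance away and pay the loading cost again.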

Angular also improved our JavaScript code as a whole by enforcing an MVC structure. While most of the JavaScript had to be rewritten to mesh with the Angular pattern, it was a worthwhile endeavor.

WebRTC Meets CSS

We ran into similar complexity issues with our CSS: the modularization wasn’t quite right, we had overlapping styles, etc. It wouldn’t be fair to say that our codebase was massive in comparison to a lot of other systems, but as a small team working at a startup: 1) time is essential and 2) organization is key.

The key to reducing complexity with our CSS was the usage of Sass and Compass.

The first time I used Sass, I wasn’t very pleased with it. But today, I can’t imagine life without it. In short, Sass is a CSS extension language: you write stylesheets with features such as variables, nesting, and mixins, and compile them down to plain CSS.

Compass is a CSS authoring framework that builds on top of Sass and really helped us out with interface design. We used Compass to help us make sprites—there’s a great tutorial here, if you’re interested.

With Compass, we could speed up the creation of complex buttons. It was as simple as putting the buttons’ image states in a folder, at which point Compass would generate a sprite to contain them.

In our case, Compass reads all the files in the administration_buttons folder and puts them into a single sprite file.

Through the @include administration_buttons-sprite(add_user) command, I add the file add_user.png as a background image for the add-user-button class.

With Compass, all I had to do was put some images in a folder.

Without Compass, I would’ve had to:

Put all the images in a single file.

Get all the coordinates for each image and put them on the class.
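To give a feel for that manual bookkeeping, here is a rough sketch of the coordinate work for a vertically stacked sprite. This is not how Compass is implemented; buildSpriteCss is a made-up helper, and the 32x32 image sizes and the remove_user.png file are assumptions for the example.

```typescript
interface SpriteImage {
  file: string;
  width: number;
  height: number;
}

// Stacks the images vertically and returns one CSS rule per image.
function buildSpriteCss(spriteUrl: string, images: SpriteImage[]): string[] {
  let offsetY = 0;
  return images.map((image) => {
    // add_user.png -> add-user-button
    const className = image.file.replace(/\.png$/, "").replace(/_/g, "-") + "-button";
    // Positions are negative: they shift the sprite up behind the element.
    const position = offsetY === 0 ? "0 0" : "0 -" + offsetY + "px";
    offsetY += image.height;
    return "." + className + " { background: url(" + spriteUrl + ") " + position +
      " no-repeat; width: " + image.width + "px; height: " + image.height + "px; }";
  });
}

const rules = buildSpriteCss("administration_buttons.png", [
  { file: "add_user.png", width: 32, height: 32 }, // sizes are assumed
  { file: "remove_user.png", width: 32, height: 32 }, // hypothetical second image
]);
console.log(rules.join("\n"));
```

Every time an image is added, removed, or resized, all the offsets after it change, which is exactly the chore Compass took off my hands.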

For a developer like me who’s not particularly Photoshop-savvy, this was a huge time saver. And, again, time is crucial at this stage.

WebRTC Meets Rails

After several months of JavaScript and CSS, we finally moved on to our Ruby on Rails back-end.

Rails patterns are great for keeping projects simple, as the MVC structure splits responsibilities very clearly. But if your system starts to grow, Rails patterns alone may be insufficient. Instead, you need to think about new, high-level ways to solve your problems: new means of abstraction.

We’ve all studied code patterns, ideologies, tutorials, guides, etc. We all know about splitting your logic into classes and your classes into functions and all that. But rarely are we taught solutions. I’ve never read a book about it and I’ve never seen a course on it.

After working with Rails for some time, I started to see that the problem was not the code; it was the way we had chosen to solve the client’s problem.

Back to Ondello for a specific example: we had a very simple mail system built on top of Rails. For some time, this worked well for us. But eventually, we needed a special kind of email for every client. As time went on, the differences between emails grew larger and larger. In the end, every client required a totally different email.

I started to see that as my codebase grew, this kind of functionality would have a huge impact on my whole WebRTC framework and mail system. That is when we decided to create a separate project solely for the mail system, responsible for handling all kinds of messages. The main project would just send (via REST) some parameters and a client identifier. Then, the mailing system would take care of assembling and sending emails.
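The split can be sketched roughly as follows. This is an illustration of the idea rather than our actual code (which lived in Rails): the field names, the “clinic-a” client, and the email wording are all hypothetical. The main project only builds a small JSON payload; the mail project owns every per-client difference.

```typescript
type Params = { [key: string]: string };

// What the main project sends over REST: a client identifier plus parameters.
interface MailRequest {
  clientId: string;
  params: Params;
}

interface Email {
  to: string;
  subject: string;
  body: string;
}

type Template = (params: Params) => { subject: string; body: string };

// Used when a client has no custom email.
const defaultTemplate: Template = (p) => ({
  subject: "Appointment reminder",
  body: "Hello " + p.patientName + ", your appointment is at " + p.time + ".",
});

// Per-client emails live only in the mail project (hypothetical client id).
const clientTemplates: { [clientId: string]: Template } = {
  "clinic-a": (p) => ({
    subject: "Clinic A: see you at " + p.time,
    body: "Dear " + p.patientName + ", Clinic A is expecting you at " + p.time + ".",
  }),
};

// Mail project: pick the client's template, fall back to the default one.
function assembleEmail(request: MailRequest): Email {
  const hasCustom = Object.prototype.hasOwnProperty.call(clientTemplates, request.clientId);
  const template = hasCustom ? clientTemplates[request.clientId] : defaultTemplate;
  const rendered = template(request.params);
  return { to: request.params.to, subject: rendered.subject, body: rendered.body };
}

// Main project side: this JSON string would be the body of the REST call.
const reminderRequest: MailRequest = {
  clientId: "clinic-a",
  params: { to: "patient@example.com", patientName: "Ana", time: "10:00" },
};
const requestBody = JSON.stringify(reminderRequest);

console.log(assembleEmail(JSON.parse(requestBody)));
```

Adding a new client's email then means touching only the mail project; the main project keeps sending the same kind of payload.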

This approach of splitting one project into several was helpful when developing Ondello. It allowed the main project to remain simple and free of unnecessary responsibilities.

This kind of problem solving is what I call a solutions approach. I find it essential for developing complex applications.

Conclusion

Whether you’re a Ruby, Java, or PHP developer, web development involves so many frameworks, technologies, and tools that you can or should use to make your system work that, in my eyes, it’s hard to see anyone as just a “Ruby” programmer or a “Java” programmer.

When it comes to big systems, you need to design well-crafted solutions that accommodate your clients’ needs, and those solutions won’t always fit in with your favorite technologies or even the technologies with which you’re comfortable. The common theme in my development of Ondello was that I had to be flexible and agile, using the best-suited technologies available at any given moment regardless of my preferences.

In developing Ondello, I remembered that, above all, we must be engineers, solving problems with the best tools possible, and getting the job done in an efficient, scalable, and practical manner.

About the author

Alexandre is an expert Ruby on Rails developer who is also experienced with Java and various front-end technologies. In addition, he's worked extensively with WebRTC, building video-based web apps from scratch.

Comments

Eric Peterson

Your dedication to solving problems and building a great app is an inspiration. Seems like there's no part of the stack you didn't touch, and you didn't let yourself get stuck with what you were comfortable with. Your conclusion is great.

Юджин Юджин

Great article and experience.
Thanks for sharing your knowledge.
PS: Always thought that it is a painfull experience to work with web rtc, at least for now)

Jamie Lanister

Hey can you tell me which one is the best api out there for using webrtc? like opentok etc?
