Building A Real-Time Retrospective Board With Video Chat

Quick Summary

If you’ve ever worked in an agile environment, chances are you’ve had your share of “retrospectives” — meetings where people write what made them “glad,” “mad” or “sad” onto different-colored notes, post them onto a board, arrange them in groups and — most importantly — talk about them.
These meetings are straightforward, as long as everyone is in the same room. But if you’re working with a geographically distributed team, things can get a bit tricky. Let’s address this by creating a virtual version of our board to allow team members in different locations to hold their retrospective just as if they were in the same room.

Nice. Now, let’s stop it again. What, why? Well, at the moment, our server is open to the world. Anyone can connect to it and learn what happened in our retrospective. Let’s make sure that every user connecting to deepstream at least knows the password, sesame. To do this, we need to register a permissionHandler — an object that checks whether a client is allowed to log in and whether it may perform a certain action. So, let’s use the same tutorialServer.set() method we’ve used before.
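As a rough sketch, such a handler could look like the following. The method names isValidUser and canPerformAction, and the callback( error, username ) convention, are assumptions about the deepstream version in use, so check them against your server’s documentation. The password sesame and the 'No username specified' message come from this tutorial; the 'Wrong password' message is made up for illustration.

```javascript
// Sketch of a permission handler. Method names and callback
// conventions are assumptions, not confirmed deepstream API.
const permissionHandler = {
  // Decides whether a connecting client may log in.
  isValidUser(connectionData, authData, callback) {
    if (!authData || !authData.username) {
      // Any user name will do, as long as there is one.
      callback('No username specified');
    } else if (authData.password !== 'sesame') {
      // Everyone shares the same password.
      callback('Wrong password');
    } else {
      callback(null, authData.username);
    }
  },

  // Decides whether a logged-in client may perform an action.
  // Here, everything is allowed once the client is logged in.
  canPerformAction(username, message, callback) {
    callback(null, true);
  }
};

// Registered like the other server options (not executed here):
// tutorialServer.set('permissionHandler', permissionHandler);
```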

So, are we connected and ready for some real-time awesomeness? Um, not quite. At the moment, our connection is in a kind of quarantine state, waiting for the user to log in. To do this, we’ll create the world’s most basic log-in form:

Once the user hits the log-in button, we’ll read the values from the form, send them to deepstream using its login() method and wait for the response. Should the response be positive (success === true), we’ll hide the log-in form and start the application. Otherwise, we’ll show the error message that we set in permissionHandler earlier (for example, callback( ‘No username specified’ );).
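The flow just described can be sketched roughly like this. The callback signature ( success, errorEvent, errorMessage ) is an assumption about the deepstream client, and ui is a hypothetical helper object standing in for the form-handling code, so treat this as an outline rather than the tutorial’s exact script.

```javascript
// `client` is any object exposing login(authData, callback); the
// (success, errorEvent, errorMessage) signature is an assumption.
// `ui` is a hypothetical object with hideForm(), startApp() and showError().
function attemptLogin(client, credentials, ui) {
  client.login(credentials, function (success, errorEvent, errorMessage) {
    if (success === true) {
      ui.hideForm();
      ui.startApp(credentials.username);
    } else {
      // errorMessage is whatever the permission handler passed back.
      ui.showError(errorMessage);
    }
  });
}
```

In the browser, credentials would be read from the form’s user name and password fields when the log-in button is clicked.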

Building The Board

Phew! Finally, we’ve got all of the log-in bits out of the way and can start building the actual UI. But first, let’s talk about records and lists. Deepstream’s data sync is based on a concept called “records.” A record is just a bit of data — any JSON structure will do.

Collections of records are called lists. A list is a flat array of record names. It has methods similar to a record’s but also some specific ones, like hasEntry() and removeEntry(), as well as list-specific events, such as ‘entry-added’.

For our board, we’ll use both records and lists. The board will be represented as a list, and each sticky note will be an individual record.
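To make the split concrete, here is one way the data could be laid out. This is plain JavaScript data rather than deepstream API calls, and the record names and field names ( type, content, position ) are hypothetical.

```javascript
// Illustrative data only: record names and field names are hypothetical.

// The board is a list: a flat array of record names.
const boardEntries = ['postit/1', 'postit/2'];

// Each sticky note is a record: an arbitrary JSON structure.
const records = {
  'postit/1': { type: 'glad', content: 'Shipped on time', position: { left: 40, top: 60 } },
  'postit/2': { type: 'sad', content: 'Flaky tests', position: { left: 220, top: 60 } }
};

// Resolve every entry of the list to the record data it names.
function notesOnBoard(entries, recordsByName) {
  return entries.map(function (name) {
    return Object.assign({ name: name }, recordsByName[name]);
  });
}
```

Because the list holds only names, moving or editing a note changes one record and leaves the list untouched; the list changes only when notes are added or removed.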

Wiring It Up To The DOM

Now that we’re armed with this knowledge, the next thing to do is set the sticky note’s text in the record whenever the user changes it — and update the DOM whenever a change comes in. If we use a textarea field, here’s what that would look like:
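A minimal sketch of that two-way binding follows. It assumes the record exposes get(), set() and subscribe() and that subscribers are also notified of local changes; the field name content and the choice of the input event are illustrative.

```javascript
// `record` needs get(path), set(path, value) and subscribe(path, callback).
// `textarea` needs a `value` property and addEventListener().
function bindContent(record, textarea) {
  // Local edits go into the record...
  textarea.addEventListener('input', function () {
    record.set('content', textarea.value);
  });

  // ...and every change to the record goes into the DOM.
  record.subscribe('content', function (content) {
    // Skip redundant writes so the caret does not jump while typing.
    if (textarea.value !== content) {
      textarea.value = content;
    }
  });

  // Show whatever the record already holds.
  textarea.value = record.get('content') || '';
}
```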

The Hard Bits

Easy enough so far. At this point, your changes will already sync across all connected clients. So, let’s add some dragging to our sticky notes.

This should be fairly straightforward. We’ll just use jQuery’s draggable functionality, and whenever the position changes, we’ll update both the DOM element’s position and the value of the record’s position. OK? But then we’ll also need to subscribe to the record’s position field to apply incoming changes — in which case, we’ll need to differentiate between local and remote changes. Surely, an if condition would… STOP!

Let me stop you right there. Building a collaborative real-time app can be very hard — or very easy, depending on your approach. Don’t try to orchestrate different callbacks to keep local and remote changes in sync. Make your life easier and just use the record as a single source of truth. To stick with our draggable example, here’s what I mean:
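Here is a rough sketch of the idea, independent of any particular drag library. It assumes that the record notifies subscribers of local as well as remote changes and that a third argument of true to subscribe() invokes the callback immediately; applyPosition is a hypothetical function that moves the element.

```javascript
// `record` needs set(path, value) and subscribe(path, callback, triggerNow).
// `applyPosition` moves the element, e.g. position => element.css(position).
function bindPosition(record, applyPosition) {
  // One place updates the DOM: the subscription. It fires for local and
  // remote changes alike, so no "who changed it?" check is needed.
  record.subscribe('position', applyPosition, true);

  // The drag handler never touches the DOM; it only writes to the record.
  return function onDrag(left, top) {
    record.set('position', { left: left, top: top });
  };
}
```

The returned onDrag function would be called from the drag callback of whichever drag library is in use, passing the element’s new left and top offsets.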

Notice how the dragging and DOM updates are now decoupled. We’ll apply a similar concept to our sticky note list. Whenever the user clicks “Add note,” we’ll add an entry to the list. Whenever an entry is added to the list (whether locally or by another user), we’ll add a note to the board.
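The same pattern for the list might look like this. The 'entry-added' event is mentioned above; getEntries() and addEntry() are assumed to exist alongside removeEntry(), and renderNote is a hypothetical function that builds the DOM for one note.

```javascript
// `list` needs getEntries(), addEntry(name) and on('entry-added', callback).
// `renderNote` is a hypothetical function that creates the DOM for one note.
function bindBoard(list, renderNote) {
  // Notes that were on the board before we joined.
  list.getEntries().forEach(function (name) {
    renderNote(name);
  });

  // Notes added later, by us or by anyone else.
  list.on('entry-added', function (name) {
    renderNote(name);
  });

  // The "Add note" click handler only touches the list.
  return function addNote(recordName) {
    list.addEntry(recordName);
  };
}
```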

These should be all of the main building blocks of our board. Thanks for holding out with me for so long. I’ve skipped a few lines that wire things together; to see the full code, please have a look at the GitHub repository.

Adding Video Chat

Now it’s time to tackle the video-chat part of our retrospective board.

Video chat via WebRTC.

Retrospectives are all about people talking to each other. Without communication, even the best collection of suggestions and feedback will remain unused.

Let’s Talk About WebRTC

Chances are that if you’re working in web technology, you’ve come across WebRTC. It’s an exciting new standard that allows us to transmit audio, video and even data streams directly between browsers without having to route them through a server.

However, as far as browser APIs go, WebRTC is one of the most complicated ones. And despite being a peer-to-peer protocol, it still requires a server. The reason for all of this is that in order to connect two browsers, both have to know where the other one is — and that is way more complicated than it sounds.

Imagine a friend asking for your address. You answer, “I’m in the bedroom” — leaving it to them to find out which house your bedroom is in, which street your house is on, which town that street is in and so on. And once they can reliably locate your bedroom, you still have to provide a list of windows and doors they have to try to see if one is open.

Deepstream tries to abstract all of that away and reduce WebRTC to two concepts: a phonebook and a call. We’ll use both to create a video chat room that allows our team to talk about what’s happening on the retrospective board.

Connect The Streams

Video in a browser comes in the form of a MediaStream. These streams are a combination of audio and video signals that can be played in a video element or sent to someone else via the Internet. You can retrieve a stream from a webcam or microphone, from another user via WebRTC or, once captureStream is fully supported, even from a canvas element.

Getting Your Local Webcam Stream

Let’s start with our local webcam and microphone stream. It can be retrieved using getUserMedia — and immediately the trouble starts. getUserMedia has been around for a while now, but the API is still not fully standardized and, therefore, is still vendor-prefixed. But help is at hand. The official WebRTC initiative maintains an adapter script that normalizes browser differences and stays up to date with API changes. You can find it on GitHub.

Once it’s installed, retrieving your local video and audio stream and playing it in a video tag is as simple as this:
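A minimal sketch, using the promise-based navigator.mediaDevices.getUserMedia() from the current standard rather than the prefixed, callback-style variants discussed above. The function takes the media devices object and the video element as parameters so that it can be tried outside a browser as well.

```javascript
// `mediaDevices` is navigator.mediaDevices in a browser.
// `videoElement` is a <video autoplay> element.
function startLocalVideo(mediaDevices, videoElement) {
  return mediaDevices
    .getUserMedia({ video: true, audio: true })
    .then(function (stream) {
      videoElement.srcObject = stream;
      // Mute the local preview to avoid hearing your own microphone.
      videoElement.muted = true;
      return stream;
    });
}

// In a browser (not executed here):
// startLocalVideo(navigator.mediaDevices, document.querySelector('video'));
```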

Make Sure To Handle Errors

Whenever an application requests access to a user’s webcam or microphone, a lot of things can go wrong. A user might not have a webcam at all, might have a webcam but no microphone, might have a webcam that cannot provide the required resolution, or might simply not have allowed the application access to their media devices. All of these cases are captured in getUserMedia’s error callback. Have a look at the official specification for the full list of errors that could occur.
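As an illustration, the failure cases listed here could be mapped to readable messages like so. The error names are the ones defined by the Media Capture and Streams specification; the message texts are made up, and the function would be passed the error object that getUserMedia reports.

```javascript
// Translates a getUserMedia error into a message for the user.
// The message wording is illustrative.
function describeMediaError(error) {
  switch (error && error.name) {
    case 'NotAllowedError':
      return 'Access to the camera or microphone was denied.';
    case 'NotFoundError':
      return 'No camera or microphone was found.';
    case 'NotReadableError':
      return 'The camera or microphone could not be read; it may be in use.';
    case 'OverconstrainedError':
      return 'No device satisfies the requested constraints, such as resolution.';
    default:
      return 'Could not access media devices.';
  }
}
```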

Registering For Incoming Calls

Now that we’ve got our local video stream, it’s time to add ourselves to the phonebook and listen for others adding themselves. To let the others know who we are, we’ll use the user name we’ve logged in with.

ds.webrtc.listenForCallees will invoke this.onCallees immediately with a list of all currently registered callees and then again whenever another user is added to or removed from the phonebook.
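Since the callback receives the complete phonebook each time, working out who has joined or left means comparing the new list with the previous one. A small helper for that, assuming the callback is given a plain array of callee names, might look like this:

```javascript
// Compares two phonebook snapshots (arrays of callee names) and
// reports which names were added and which were removed.
function diffCallees(previous, current) {
  return {
    joined: current.filter(function (name) {
      return previous.indexOf(name) === -1;
    }),
    left: previous.filter(function (name) {
      return current.indexOf(name) === -1;
    })
  };
}
```

The left array is useful for removing the video element of a participant who has closed the board.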

This will help us solve an inherent problem of peer-to-peer systems: rooms.

The Problem With Rooms

Rooms are a common concept in every chat application: A number of participants all talk to each other at the same time. With a centralized server, this is easy: You log in and get every participant’s video stream. With a network of peer-to-peer connections, however, things are a bit trickier.

To create a room, every participant has to connect to every other participant exactly once. To achieve this, we’ll assume two things:

that the whole phonebook (i.e. the array of callee names, provided by listenForCallees) constitutes one room;

that every new user has to call all currently present users (this way, the first user to log in won’t call anyone, the second user will call the first, the third user will call the other two and so on).
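The calling rule can be sketched as follows. Here, makeCall is a hypothetical function that starts a call to a single callee, and the sketch assumes that the first phonebook callback already contains everyone who is currently present.

```javascript
// Returns a phonebook callback that calls everyone present when we
// join and does nothing afterwards. `makeCall(name)` is hypothetical.
function createRoomJoiner(ownName, makeCall) {
  let joined = false;
  return function onCallees(callees) {
    if (joined) {
      return; // later joiners call us, so there is nothing to do
    }
    joined = true;
    callees
      .filter(function (name) { return name !== ownName; })
      .forEach(function (name) { makeCall(name); });
  };
}

// Each pair is connected exactly once, so a room of n participants
// needs n * (n - 1) / 2 connections in total.
function connectionCount(participants) {
  return participants * (participants - 1) / 2;
}
```

The connection count grows quadratically, which is one reason full-mesh rooms like this suit small teams better than large audiences.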

Waiting For Incoming Calls

Great! We’re now connected to everyone who’s in the room. The bit that’s left is to accept incoming calls from new participants. When we registered ourselves as a callee, we provided a callback function for incoming calls:
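A sketch of what such a callback could do is shown here. The shape of the call object, with accept( stream ) and an 'established' event that carries the remote stream, is an assumption and not confirmed API; addVideo is a hypothetical function that shows a stream in a new video element.

```javascript
// `call` is assumed to expose accept(stream) and on(event, callback);
// the 'established' event is assumed to carry the remote stream.
// `addVideo` is a hypothetical function that displays a stream.
function createIncomingCallHandler(localStream, addVideo) {
  return function onIncomingCall(call) {
    // Show the caller's video once the connection is up.
    call.on('established', function (remoteStream) {
      addVideo(remoteStream);
    });
    // Answer with our own webcam and microphone stream.
    call.accept(localStream);
  };
}
```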

That’s it! From now on, every time you log into the retrospective board, your webcam will spring to life, you’ll be connected to all other members of your team, and every new joiner will automatically connect to you.

As with the first part of the tutorial, I’ve skipped a few lines that wire things together. To get the full script, please look at the GitHub repository.

Is That All There Is To Building Production-Ready Video Chat?

Well, almost. WebRTC is used in production in large-scale apps like Google Hangouts and Skype for Web. But the developers of those apps had to take some detours to achieve their quality of service.

Hangouts relies on a number of non-standard features built specifically into Chrome (and available as plugins for other browsers), whereas Skype for Web is investigating a parallel standard, called Object Real-Time Communication (ORTC), which is currently supported only by Microsoft Edge.

That might sound an awful lot like the standards battles of the past, but things are actually looking quite promising this time: ORTC isn’t meant to compete with WebRTC, but rather to augment and ultimately complete it. It is designed to be shimmable and, eventually, to be merged with WebRTC in the next version after 1.0.

But Why Is It Necessary?

Production-ready RTC apps use a number of techniques to achieve a solid user experience across devices and bandwidths. Take Simulcast, which allows us to send different resolutions and frame rates of the same stream. This way, it leaves the recipient to pick a quality to display, rather than performing CPU-intensive on-the-fly compression; it is, therefore, a fundamental part of most video chats. Unfortunately, Simulcast has only just made it into the WebRTC 1.0 specification. It is, however, already available in ORTC.

The same is true for a number of other low-level APIs. WebRTC is usable today, but it will not be fully production-ready until it has been consolidated with ORTC and the browsers’ video codecs have been aligned.

Until then, great low-level libraries like SimpleWebRTC and adapter.js will be around to bridge the gap, and high-level technologies like deepstream give developers a head start on building a solid RTC project without having to worry much about its internals.