The Best WebRTC Security is Prone to the Stupidest Developer

WebRTC is the most secure technology for video communications. And yet – developers can still screw this up for you.

Security breaches and data theft incidents were on the rise in 2016. You can see this from the amount of coverage out there. I’ve written about WebRTC and security for quite some time, but a recent post I read compelled me to write about it again.

It probably happens more often than not. You build a service. You take care of its security. And then, someone down the line screws you over with their maintenance processes. To some extent, this is just as bad as social engineering, where a hacker tries to gain access by fooling people into believing he is someone else.

Print it and stick it on the wall behind your monitor so you don’t forget.

WebRTC Security baseline

WebRTC comes with a few security concepts that are quite new and innovative in VoIP:

In WebRTC, EVERYTHING is encrypted. Not only by default, but also in a way that can’t be modified – there is no way to send data over WebRTC in the clear

WebRTC forces you to operate over HTTPS and WSS in your web application, so signaling gets encrypted as well

Screensharing may require an additional layer of consent, like the creation of a browser extension in Chrome

Browsers today update frequently and automatically, so any security threat found gets patched faster than most enterprise and VoIP vendors react to their security breaches
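The HTTPS and WSS point above is easy to check on your own side as well. Here is a minimal sketch, assuming your application keeps a list of the URLs it uses for pages and signaling; the function names and the example.com URLs are made up for illustration.

```python
from urllib.parse import urlparse

# Schemes that give you an encrypted transport for web pages and signaling.
SECURE_SCHEMES = {"https", "wss"}


def is_secure_endpoint(url: str) -> bool:
    """Return True when the URL uses https or wss."""
    return urlparse(url).scheme.lower() in SECURE_SCHEMES


def insecure_endpoints(urls):
    """Return the URLs that would carry traffic unencrypted."""
    return [u for u in urls if not is_secure_endpoint(u)]


if __name__ == "__main__":
    # Hypothetical endpoints of a service, one of them misconfigured.
    endpoints = [
        "https://app.example.com/",
        "wss://signal.example.com/ws",
        "ws://signal.example.com/legacy",
    ]
    print(insecure_endpoints(endpoints))
```

A check like this can run in a deployment script so that a plain ws:// or http:// URL never makes it into production by accident.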

The thing people forget is that WebRTC is just a piece of technology. A building block. It is up to the developers to decide how to use it in their own product. During that integration, security breaches can be created quite easily.

In the WebRTC course I launched two months ago, I’ve added a lesson dealing with WebRTC security. It goes through the mechanisms that exist in WebRTC and the areas that need to be further secured by the application.

Two big issues left to developers today are TURN passwords and access to backend server resources.

#1 – TURN passwords

TURN servers predate WebRTC. They are used by SIP (or at least are found in the spec), and there, the notion is that the user agent (=device/endpoint) is secure and “named”. So a username and password mechanism was created to get a TURN allocation. The reason you want such a mechanism in the first place is that TURN servers are bandwidth hogs – they relay media, and that relaying costs a lot in terms of bandwidth. So if you are paying for it, you don’t want others to piggyback on it.

The problem with this approach in WebRTC is that the username and password need to be passed from your JavaScript code inside the browser to the TURN server. That means the information is available in the clear in many use cases – those where you don’t need or want the user to identify with the network at all. You also don’t want someone sniffing your code in the browser and then reusing these credentials elsewhere.

The current approach out there is to use temporary passwords (I like calling them ephemeral – it makes me sound intelligent). Ones that become useless in an hour or two.

This means that someone in your backend randomly creates a password that is short-lived and shares it with both the TURN server and the client.

The above illustrates how this is done.

The App Server, in charge of signaling in this case, creates a password. It updates the TURN server about said password and also gives that information to the User

The User then creates a peer connection, configuring the TURN server in it with the relevant temporary password
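The two steps above can be sketched roughly as follows. This is only an illustration under a few assumptions: the in-memory TurnCredentialStore stands in for whatever mechanism your TURN server really uses to learn about passwords, the one hour lifetime is picked from the “hour or two” mentioned earlier, and the names and the turn.example.com URL are made up. The returned dictionary follows the shape of the iceServers configuration a browser expects when creating a peer connection.

```python
import secrets
import time

TTL_SECONDS = 3600  # assumption: passwords live for one hour


class TurnCredentialStore:
    """Hypothetical stand-in for the TURN server's view of valid passwords."""

    def __init__(self):
        self._by_username = {}

    def put(self, username, password, expires_at):
        self._by_username[username] = (password, expires_at)

    def is_valid(self, username, password, now=None):
        now = time.time() if now is None else now
        entry = self._by_username.get(username)
        if entry is None:
            return False
        stored_password, expires_at = entry
        # Constant-time comparison, and reject anything past its expiry.
        same = secrets.compare_digest(stored_password.encode(), password.encode())
        return same and now < expires_at


def issue_turn_credentials(store, user_id, turn_url, now=None, ttl=TTL_SECONDS):
    """App Server side: create a short-lived password and share it both ways."""
    now = time.time() if now is None else now
    expires_at = int(now) + ttl
    username = f"{expires_at}:{user_id}"
    password = secrets.token_urlsafe(24)  # random, never hard-coded in client code
    store.put(username, password, expires_at)  # "update the TURN server"
    # This is what gets sent to the User over the (encrypted) signaling channel.
    return {
        "iceServers": [
            {"urls": [turn_url], "username": username, "credential": password}
        ]
    }


if __name__ == "__main__":
    store = TurnCredentialStore()
    config = issue_turn_credentials(store, "alice", "turn:turn.example.com:3478")
    print(config["iceServers"][0]["username"])
```

The point of the sketch is the lifetime: even if someone lifts the password out of the browser, it stops working once the expiry passes.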

Great.

Now let’s add a media server into the mix.

Who should be generating that password, and who should it be passed to? Should the Media Server now be in charge of it, or is it still up to the App Server to take care of this?

Which leads me to the second important security aspect of WebRTC when it comes to your development – backend server resources you need to protect.

#2 – Backend server resources

In many cases, I find that when the work is outsourced, the end result tends to be a jumble of an architecture if things aren’t thought out properly from the beginning.

This usually ends with the wrong servers needing to connect and communicate directly with the User. While not an issue on its own, it can easily turn into a headache:

Not having a clear picture of the state in your backend means you lose control – this can turn ugly when issues arise

Opening up more of your backend towards the internet means more points to secure against penetration

And yes – I know there’s a trend to treat servers in the cloud as if they are always open to the internet

Which means you need to think about how best to protect them anyway, and the best protection happens to be closing them off as much as possible

What I suggest in many cases is:

Media servers should never be controlled or accessed directly from the Internet

Media servers should only pass media to and from the Internet

Whenever they need to be controlled, you do that using backend-to-backend communication from other servers you have that are already managing the users on the Internet
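One way to picture the suggestion above is a control interface on the media server that only answers to your other backend machines. The sketch below is an illustration with made-up names, and it assumes your backend lives in the private 10.0.0.0/8 range. In a real deployment you would likely enforce this at the network level too (firewall rules, private subnets), not only in code.

```python
import ipaddress

# Assumption: the App Server and other backend machines sit in this private range.
INTERNAL_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]


def is_backend_peer(remote_ip: str) -> bool:
    """Return True only for callers coming from the internal backend network."""
    try:
        addr = ipaddress.ip_address(remote_ip)
    except ValueError:
        return False  # not a parsable IP address, so refuse it
    return any(addr in net for net in INTERNAL_NETWORKS)


def handle_control_request(remote_ip: str, command: str):
    """Hypothetical control entry point of a media server.

    Media still flows to and from the Internet elsewhere; this path is only
    for commands, and only other backend servers may use it.
    """
    if not is_backend_peer(remote_ip):
        return 403, "control API is backend-only"
    return 200, f"accepted {command}"


if __name__ == "__main__":
    print(handle_control_request("10.0.0.5", "start-recording"))
    print(handle_control_request("203.0.113.7", "start-recording"))
```

With this split, the User only ever talks to the App Server for control, and the App Server relays whatever is needed to the media server over the internal network.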

What’s next?

I am not a security expert. I know a bit about it and try to stay informed, but that is as far as it goes.

Make sure you take security into consideration when developing your service, and don’t assume WebRTC does everything for you. It doesn’t, but it is the best starting point you’ll get.

If you want to learn more about WebRTC, I will be opening the course again for another round. Probably during April.

If you are a company looking for open access to the course materials throughout the year for your workforce – I am going to announce such a plan soon, but feel free to reach out to me before that happens.