
UDP for games – encryption and DDoS protection

Updated on April 27, 2016 by Tek Editor

[[This is Chapter XI(c) from “beta” Volume 2 of the upcoming book “Development&Deployment of Multiplayer Online Games”, which is currently being beta-tested. Beta-testing is intended to improve the quality of the book, and provides a free e-copy of the “release” book to those who help with improving; for further details see “Book Beta Testing”. All the content published during Beta Testing is subject to change before the book is published.]]

Why Encrypt??

Yes, you DO need to encrypt your UDP traffic.1 And no, using UDP is NOT a valid excuse to skip encryption. Reasons for encrypting your traffic are numerous:

In short –

not only does encryption protect your players from classical attacks, it also protects your game against cheaters.

As a side bonus, with proper encryption you can be sure that network errors which corrupt your packets do not go undetected (with unencrypted UDP, the 16-bit UDP checksum misses roughly one out of every 65’000 random in-transit corruptions, which means that with all those millions of packets you’re sending out each second, some corruptions WILL go undetected, causing all kinds of trouble).

On the other hand, you need to keep in mind that having encryption does NOT eliminate the need to sanitize your data, at the very least on the Server Side of things (even with encryption in place, the Client can be hacked to send your Server all kinds of malicious data – from garbage to fakes).

1 to those coming from security side of things: of course, I don’t mean “just encrypt”, but “provide both confidentiality and integrity” (with authenticity on the side)

2 actually, we’re speaking about authentication here, but most of the protocols out there provide both encryption and authentication, so using “encryption” in the common-in-game-industry sense of “everything crypto-related” is not as much of a misnomer as it may look at first glance

Isn’t Encryption Damn Expensive?

The next question is the following: well, we DO need to encrypt, but can we afford it? Won’t adding the encryption kill our servers CPU-wise? While we’ll discuss this issue in detail later in Chapter [[TODO]], for now we’ll need just a few very basic observations.

In short, there are two main ways to encrypt things: (a) using symmetric key (a.k.a. “symmetric crypto”, AES-128 and AES-256 being among the most popular ones), (b) using asymmetric keys (a.k.a. “public crypto”, with RSA-2048 still being quite popular in this department).

Symmetric crypto is damn cheap; for x86, it is usually of the order of “100+ Mbytes/second per core”.3 It means that if your server is serving 1’000 players and sending a 1Kbyte-sized packet 20 times a second to each of them (i.e. quite a respectable 160Mbit/sec), symmetric crypto will cost you less than 1/5th of one CPU core. As on a usual “workhorse” server (see Chapter [[TODO]] for further discussion) there are currently around 8-12 such cores, the overall impact of symmetric encryption in such an example scenario amounts to about 2% of additional CPU load; if you ask me, a 2% increase in number-of-servers-you-need-to-run is certainly not much to protect your players both from eavesdropping etc. and from cheaters.
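The arithmetic above can be re-run for your own numbers with a few lines of Python. This is only a back-of-the-envelope sketch: the function name and parameters are made up for illustration, 1Kbyte is taken as 1'000 bytes (which is what the 160Mbit/sec figure implies), and 10 cores is assumed as the middle of the 8-12 range.

```python
def symmetric_crypto_load(players, packet_bytes, packets_per_sec,
                          crypto_bytes_per_sec_per_core, cores_per_server):
    """Return (traffic in bits/s, cores spent on crypto, fraction of the server)."""
    traffic_bytes = players * packet_bytes * packets_per_sec
    cores_used = traffic_bytes / crypto_bytes_per_sec_per_core
    return traffic_bytes * 8, cores_used, cores_used / cores_per_server


bits, cores, fraction = symmetric_crypto_load(
    players=1_000, packet_bytes=1_000, packets_per_sec=20,
    crypto_bytes_per_sec_per_core=100e6,  # the "100+ Mbytes/second per core" ballpark
    cores_per_server=10)                  # assumed: middle of the 8-12 range
print(f"{bits / 1e6:.0f} Mbit/s, {cores:.2f} cores, {fraction:.1%} of the server")
```

With faster crypto (the "+" in "100+") the cores-used figure only goes down, so 1/5th of a core is an upper estimate under these assumptions.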

On the other hand, public crypto is MUCH more expensive, but fortunately, it is needed only to establish connection (and as a result of such public-crypto-while-establishing-connection, a symmetric key will be generated for subsequent symmetric crypto). Specific numbers vary greatly from one algorithm to another, but as a ballpark number, with TLS/DTLS we can take an estimate of 1’000 connections/second/x86 core.4 So, for our 1000-player server example above, even if all of your players got disconnected and then need to reconnect – you’ll need just about 0.1 second (using all your cores) to connect all of them. [QUIC.Crypto] protocol establishes connection at a significantly lower cost than TLS: for QUIC it is very roughly ~10x better, i.e. with QUIC we can get 10’000 connections/second/core. Note that while you MIGHT think that TLS’s 0.1 sec-to-reconnect-all-your-players is already good enough – we’ll see a bit later that connection establishment costs are VERY important from DDoS point of view (see “Resilience to Crypto-DDoS Attacks” section below).
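The reconnect estimate is a one-line division; the sketch below just makes the inputs explicit. The function name is hypothetical, and 10 cores per server is an assumption (the text only says "all your cores").

```python
def reconnect_seconds(players, handshakes_per_sec_per_core, cores):
    """Time to re-handshake all players if every core does nothing else."""
    return players / (handshakes_per_sec_per_core * cores)


tls_like = reconnect_seconds(players=1_000, handshakes_per_sec_per_core=1_000, cores=10)
quic_like = reconnect_seconds(players=1_000, handshakes_per_sec_per_core=10_000, cores=10)
print(f"TLS/DTLS: {tls_like:.2f} s, QUIC: {quic_like:.3f} s")
```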

3 that is, as of beginning of 2016 for AES family of crypters

4 that’s using ECDH+ECDSA, and once again, as of beginning of 2016

Contenders for UDP encryption: DTLS and QUIC

In practice, there are two protocols which can currently be used for practical UDP encryption: DTLS (using, for example, [OpenSSL]) and QUIC (using [libquic]). While other UDP-oriented protocols (such as SNEP/SPINS, CurveCP, or MinimaLT) are described in literature, to the best of my knowledge they lack readily-available-and-supported libraries,5 and writing your own crypto-related library usually qualifies as a Pretty Bad Idea for game development.

Now, let’s compare DTLS and QUIC. I won’t go into a lengthy discussion comparing them from theoretical security perspective; much more important for our purposes is an observation that

This is quite a pity, and restricts us to DTLS for lots of games out there; on the other hand, alternatives to using DTLS-for-all-communications include:

Using DTLS for fast-paced state-update stuff, and QUIC for slow-paced stream-based updates

Using QUIC for slow-paced stream-based updates, and skipping encryption for fast-paced updates completely

Note that this is NOT a really safe option from security perspective, so whenever substantial real money is involved, it SHOULD NOT be used. However, for quite a few games out there, it will work.

The potential attack here is an attacker modifying the (unencrypted/unsigned) data coming to the victim’s Client, and therefore modifying the world which the victim can see; in this way many nasty things become possible. If substantial money is at stake – such attacks CAN be mounted in practice (especially in certain environments such as a uni campus).

If (in spite of my advice against it) you choose to go this way, MAKE SURE AT THE VERY LEAST to encrypt ALL the data going from Client to Server, and to encrypt ALL the data which is not 100% public (both these things are really important for several reasons!); if your Client-to-Server data or private data doesn’t fit well into QUIC reliable streams – tough luck, it means that you need to use DTLS.

Also keep in mind that for games such as stock exchanges, and for all the credit-card processing, it is usually significantly easier to convince auditors (in the latter case – PCI DSS auditors) that you’re fine security-wise, if you’re using TLS/DTLS (any other protocol will cause raised eyebrows, and in the best case you will need to justify why you’re deviating from what is usually deemed “industry best practices”).

5 Both known-to-me implementations of CurveCP (NaCl and libchlorium) seem to be pretty much abandoned as of beginning of 2016, and MinimaLT doesn’t seem to have any reasonably complete implementation either (which is a pity, as MinimaLT has the best DDoS protection of the whole bunch).

Bottom line: Chances are that you DO need DTLS

Well, as it follows from the above, while there are games out there where you can get away with QUIC, for quite a few games out there you will need to deal with DTLS. Which is a pity, as DTLS is quite bulky and (as of DTLS 1.2) relatively slow.6

Resilience to Crypto-DDoS Attacks

When speaking about security, it is always about various attacks. For (properly) encrypted connections, dealing with attacks after the connection is established is usually not too difficult; however, DDoS attacks aiming at the connection handshake become even easier to mount after we add encryption. I’m currently speaking mostly about “crypto-DDoS attacks”, where the attacker sends garbage within a properly formatted crypto request message and thus causes the server to spend lots of time validating that the garbage is not really valid (see, for example, Pushdo SSL DDoS attack [Lewis12]). There is one positive side with this class of attacks though – amplification attacks (including the very popular DNS amplification attacks) usually don’t apply (phew); in particular, it means that a 10GBit/s crypto-DDoS attack can count as a “rather sizeable” one.

An Example crypto-DDoS Attack

Let’s do some example math. Let’s consider a moderately sized 10GBit/s non-amplified attack on a 100-server MOG (MOG handling like 100K players simultaneously). Let’s assume that our MOG system performs balancing (such as hardware Load Balancing or Front-End Servers, see Chapter VII for further discussion); also let’s assume that the attack is performed by 10000 PCs (each emitting 1Mbit/s on average), each PC having 4 cores on average. Let’s further assume that our ISP can handle these 10GBit/s for us.

Now let’s see what it means for our game servers. If our handshake packet is 50 bytes, it means that 10GBit/s attack can cause us ~25M connection requests/second.

As noted above, the connection request requires public crypto, and with DTLS 1.2, we can process around 1000 of such connection requests per second per core. Let’s note that we cannot really dedicate ALL our cores to handling connection requests (the game should go on even when under attack), so let’s assume that we can dedicate one core per server to DDoS handling. It means that our 100 servers will be able to handle mere 100’000 connection requests/second (and we need 250x more to withstand the attack).

Such attacks can be a very unpleasant thing (and limiting incoming connections per IP is rarely an easy task for UDP, so DDoS protection by providers might or might not help in this regard, as thresholds may be too low to trigger protection at that level), so let’s see what we can do about it. Even using optimized algorithms/handshakes (such as those in QUIC) would make it only 10x better for us (still leaving us 25x short).
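To see where the 250x and 25x shortfalls come from, here is the same math as a small Python sketch. All names are illustrative; the inputs are the example figures from the text (50-byte handshake, 100 servers, one core per server set aside for handshakes).

```python
def handshake_flood(attack_bits_per_sec, handshake_bytes, servers,
                    handshake_cores_per_server, handshakes_per_sec_per_core):
    """Return (incoming requests/s, requests/s we can process, shortfall factor)."""
    incoming = attack_bits_per_sec / (handshake_bytes * 8)
    capacity = servers * handshake_cores_per_server * handshakes_per_sec_per_core
    return incoming, capacity, incoming / capacity


for name, rate in (("DTLS 1.2", 1_000), ("QUIC-like", 10_000)):
    incoming, capacity, shortfall = handshake_flood(
        attack_bits_per_sec=10e9, handshake_bytes=50, servers=100,
        handshake_cores_per_server=1, handshakes_per_sec_per_core=rate)
    print(f"{name}: {incoming:,.0f} req/s incoming, "
          f"{capacity:,} req/s capacity, {shortfall:.0f}x short")
```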

“Proof of Work” to the Rescue

One way to deal with it is to allow our Server to request Clients to perform some “proof of work” processing7 before we even start analyzing Client’s connection request. Under normal operation, there should be no “proof of work” requested, but if Server is under crypto-attack (which can be detected by time that Server spends on processing connection requests) – it should start requesting “proof of work” from all the Clients which try to connect.

If we can force all the Clients to make some work which takes ~0.4 seconds of CPU core time to compute – then each of the 4-Core attacking PCs will be able to issue only 10 requests/second, and all 10’000 attacking PCs will be able to make only 100’000 requests/second, allowing us to withstand the attack. Even better, we don’t even need to calculate exact costs of work – our Server should simply increase amount of work requested while it is under crypto-attack, up to the point until it becomes not-so-affected by the attack. And BTW, if the attacker can see that the attack doesn’t affect you – he usually goes away fairly quickly.
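A quick sketch of the throttling effect, using the example botnet from above (10'000 PCs with 4 cores each). The function names are made up for illustration; the second function simply inverts the first to get the per-request work needed to push the botnet down to a given handshake capacity.

```python
def throttled_request_rate(attacking_pcs, cores_per_pc, work_seconds_per_request):
    """Requests/s the botnet can issue if each request costs that much core time."""
    return attacking_pcs * cores_per_pc / work_seconds_per_request


def work_seconds_needed(attacking_pcs, cores_per_pc, handshake_capacity_per_sec):
    """Core-seconds of work per request that limits the botnet to our capacity."""
    return attacking_pcs * cores_per_pc / handshake_capacity_per_sec


rate = throttled_request_rate(10_000, 4, 0.4)
work = work_seconds_needed(10_000, 4, 100_000)
print(f"{rate:,.0f} requests/s at 0.4 s of work; "
      f"{work:.1f} s of work needed for 100,000 requests/s")
```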

The cost we’re paying for this kind of protection is that we’re causing Clients (including legitimate ones) to connect more slowly while the server is under attack; however, delay of 0.4 seconds is pretty much nothing (and I would argue that even 100x-larger 40 second delay is still better than usual outcome of a DDoS, which is “being unable to connect for hours”).
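The "just keep raising the work until the attack stops hurting" policy can be sketched as a tiny controller. Everything specific here is an assumption rather than something prescribed by the text: the name `adjust_work_level`, the load thresholds, the cap of 32, and the idea that the Server measures the fraction of its handshake-dedicated CPU time that is in use. It also assumes a puzzle where each +1 of work doubles the expected Client cost, so moving one step per adjustment interval is already a fast ramp.

```python
def adjust_work_level(current, handshake_cpu_fraction,
                      high=0.8, low=0.3, max_level=32):
    """Move the requested work one step up or down based on handshake load.

    handshake_cpu_fraction: share (0..1) of the handshake CPU budget in use.
    high/low/max_level are illustrative thresholds, not recommended values.
    """
    if handshake_cpu_fraction > high:
        return min(current + 1, max_level)   # under attack: ask for more work
    if handshake_cpu_fraction < low:
        return max(current - 1, 0)           # calm again: decay back to zero
    return current                           # in between: hold steady


level = 0
for load in (0.95, 0.95, 0.90, 0.50, 0.10, 0.10, 0.10):
    level = adjust_work_level(level, load)
    print(f"load={load:.2f} -> work level {level}")
```

Calling this once per measurement interval gives the "no work requested in normal operation, slowly increasing under attack, slowly decaying afterwards" behavior without ever computing the exact cost of the work.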

7 We’ll discuss “how to implement this ‘proof of work’ stuff” a bit later in “Implementing “Proof of Work” on top of DTLS” section.

Implementing “Proof of Work” on top of DTLS

Actually, an idea to use “proof of work” to mitigate DDoS attacks to oblivion is certainly not new; it is known at least since [Juels99] and is a part of at least [MinimaLT] protocol.

Below I’ll describe one of the ways of implementing “proof of work” (with an ideology similar to “puzzles” in MinimaLT) on top of the existing DTLS protocol (and on top of a 3rd-party DTLS library):

On the Server-Side, we have a very separate secret key (let’s name it “PuzzleKey”); it MUST be completely independent from all the other keys (for example, taken as a crypto-quality random number) and SHOULD be regenerated from scratch at least on each server restart. An interesting detail is that this key does NOT need to be shared with any other party (so it MUST stay internal to our server)

On the Server-Side, let’s “intercept” all the datagrams sent by the Server (i.e. get output of your DTLS library before it gets sent to the UDP socket) – and find all the DTLS records known as HelloVerifyRequest ones (yes, this CAN be done without breaking encryption or knowing the keys).

Side note: HelloVerifyRequest itself is intended to prevent DDoS, and it does prevent a certain class of DDoS attacks (those similar to SYN flood attacks), but not a cryptographic DDoS.

If we see a HelloVerifyRequest record, we modify the datagram-which-carries-HelloVerifyRequest by adding the following four fields:

Current Server time (it is better to use something along the lines of std::chrono::steady_clock here)

Challenge (128 or so crypto-random bits should do nicely; 64 crypto-random bits or so should probably do too in practice, though it is a bit less obvious; I’d rather NOT go below 64 bits)

Amount-of-work (8 bits will be more than enough)

The value of this Amount-of-work depends on the current state of the crypto-attack on the server; if there is no attack detected – Amount-of-work should be 0, if there is an attack which affects the Server – Server SHOULD start incrementing it slowly. When not under attack – Server SHOULD reduce it back (all the way down to zero).

MAC: some kind of MAC signing tuple of (Current-server-time,Challenge,Amount-of-work,cookie-from-HelloVerifyRequest); MAC MUST be calculated using “PuzzleKey”.

(MAC, per Wikipedia: “In cryptography, a message authentication code (MAC) is a short piece of information used to authenticate a message – in other words, to confirm that the message came from the stated sender (its authenticity) and has not been changed in transit (its integrity).”)

For DDoS protection purposes, we need the fastest MAC possible, and from my experience HMAC is somewhat slower for these message sizes (at least on x86 platforms) than CBC-MAC (prepending the tuple length in bytes to our tuple before calculating MAC to make CBC-MAC secure) or CMAC/OMAC

On the Client-Side we “intercept” all the datagrams received by the Client right from the UDP socket (that is, before they reach our DTLS library), extract the (Server-time, Challenge, Amount-of-work, MAC) tuple out of the datagram, and put it aside for a little while. After the extraction we strip all the additional data from the datagram (so that the DTLS library on the Client gets the same message as it was sent by DTLS library on the Server)

When Client DTLS library responds with a ClientHello record – we again “intercept” the datagram, adding the following fields to it:

Current-Server-time (simply copied from server request)

Challenge (also copied from server request)

Amount-of-work (also copied from server request)

MAC (also copied from server request)

Puzzle solution: a number N such that the first Amount-of-work bits of SHA-1(N||Challenge) are zeros (see footnotes 8 and 9).

Again on the Server-Side, we “intercept” all the datagrams coming from UDP socket, and are looking for the one with a ClientHello record – and extract (Server-time,Challenge,Amount-of-work,MAC,N) tuple. Then, before passing the datagram to the Server-Side DTLS library (which would require public crypto and therefore would incur substantial CPU costs), we:

Check Server-time for sanity (it should always be less than our current Server time, and should be within some reasonable time window of current Server time – in other words, if the message goes back for 24 hours, something is probably wrong here)

If this check fails – send a special datagram back (on receiving such a special datagram, Client should re-establish connection from scratch)

Check that the MAC field extracted from the record does authenticate tuple (Server-time,Challenge,Amount-of-work,cookie-from-ClientHello); the check MUST be done using PuzzleKey. This is a symmetric-crypto (=”very cheap”) operation.10

Check that number N does satisfy “SHA-1(N||Challenge) has first Amount-of-Work bits as zeros” condition. This is a SHA-1 operation, which is very cheap too.

Only if all the checks are ok – we’ll strip the extra fields (so that “datagram looks exactly as it was emitted by Client-Side DTLS library”) and pass the datagram to our Server-Side DTLS library

The trickery described above effectively acts as an additional DDoS-protected transport layer for DTLS; in other words, it doesn’t change anything from DTLS point of view (which means that DTLS security remains perfectly intact); it merely sends extra challenges (when Server feels that it is under attack) and filters out packets coming from those attackers who were careless enough to skip doing ‘proof-of-work’.
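The puzzle logic from the steps above (issue on the Server, solve on the Client, verify on the Server before DTLS sees the ClientHello) can be sketched in Python as follows. This is a minimal illustration under several assumptions: the function names, the 60-second acceptance window, the 8-byte encoding of N and the field packing are all made up here; HMAC-SHA256 is used only because it ships with the standard library (the text prefers CMAC or length-prefixed CBC-MAC for speed); and the interception of datagrams and parsing of DTLS records is not shown at all.

```python
import hashlib
import hmac
import os
import struct
import time

PUZZLE_KEY = os.urandom(16)   # server-only secret, regenerated on each restart
MAX_AGE_SECONDS = 60.0        # assumed sanity window for Server-time


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of digest."""
    count = 0
    for byte in digest:
        if byte == 0:
            count += 8
            continue
        count += 8 - byte.bit_length()
        break
    return count


def puzzle_mac(server_time, challenge, work, cookie):
    # Challenge has a fixed length (16 bytes), so plain concatenation is unambiguous.
    msg = struct.pack("!dB", server_time, work) + challenge + cookie
    return hmac.new(PUZZLE_KEY, msg, hashlib.sha256).digest()


def issue_puzzle(cookie, work, now):
    """Server: fields to append to the datagram carrying HelloVerifyRequest."""
    challenge = os.urandom(16)  # 128 crypto-random bits
    return now, challenge, work, puzzle_mac(now, challenge, work, cookie)


def solve_puzzle(challenge, work):
    """Client: brute-force N so that SHA-1(N||Challenge) starts with `work` zero bits."""
    n = 0
    while True:
        digest = hashlib.sha1(n.to_bytes(8, "big") + challenge).digest()
        if leading_zero_bits(digest) >= work:
            return n
        n += 1


def verify_puzzle(server_time, challenge, work, mac, n, cookie, now):
    """Server: cheap checks to run before handing ClientHello to the DTLS library."""
    if not 0.0 <= now - server_time <= MAX_AGE_SECONDS:
        return False            # stale or from-the-future timestamp
    expected = puzzle_mac(server_time, challenge, work, cookie)
    if not hmac.compare_digest(expected, mac):
        return False            # fields were not issued by us (or were altered)
    digest = hashlib.sha1(n.to_bytes(8, "big") + challenge).digest()
    return leading_zero_bits(digest) >= work


cookie = b"cookie-from-HelloVerifyRequest"   # placeholder bytes
fields = issue_puzzle(cookie, work=12, now=time.monotonic())
n = solve_puzzle(fields[1], fields[2])
print(verify_puzzle(*fields, n, cookie, now=time.monotonic()))
```

Note that `time.monotonic()` plays the role of a steady clock here; because PuzzleKey is regenerated on restart, puzzles issued before a restart fail the MAC check anyway, so the clock resetting at restart is not a problem in this sketch.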

The idea here is that while there can be two different types of crypto DDoS attack (calculating Puzzles and not calculating Puzzles) on such protected-DTLS, handling both of them is much cheaper than handling an attack on an unprotected DTLS.

If the attacker chooses to calculate Puzzles (and solving a Puzzle is, on average, 2^Amount-of-work times more expensive than checking it) – then we’ll be able to mitigate the DDoS attack at the cost of each Client performing 0.4 sec worth of CPU core calculations (with 2016 CPUs, very roughly corresponding to Amount-of-work = 19 or so). If the attacker decides to flood us with fake ClientHello’s without solving the Puzzle – we’ll be performing only very cheap operations (such as one MAC + one SHA-1 calculation), and will be able to do (roughly) 500K such checks per second per core (or 50M checks/second using only a single core from each of our 100 servers), which is above the 25M packets/second which we need to survive our example crypto-DDoS.
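The two branches of this argument reduce to a couple of multiplications, sketched below. The function names are illustrative; the per-core hash rate is not given in the text, so it is only back-derived here from the "0.4 sec at Amount-of-work = 19" estimate.

```python
def expected_attempts(amount_of_work):
    """Average number of hashes needed to find `amount_of_work` leading zero bits."""
    return 2 ** amount_of_work


def implied_hash_rate(amount_of_work, solve_seconds):
    """Hashes/s per core implied by a given solve time (back-derived, not measured)."""
    return expected_attempts(amount_of_work) / solve_seconds


def filter_capacity(servers, checks_per_sec_per_core, cores_per_server=1):
    """Fake ClientHello's per second we can reject with MAC + SHA-1 checks alone."""
    return servers * cores_per_server * checks_per_sec_per_core


print(f"{expected_attempts(19):,} hashes on average for Amount-of-work = 19")
print(f"~{implied_hash_rate(19, 0.4):,.0f} hashes/s/core implied by 0.4 s")
capacity = filter_capacity(servers=100, checks_per_sec_per_core=500_000)
print(f"{capacity:,} checks/s vs 25,000,000 attack packets/s")
```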

As an added bonus, this kind of checks can be even offloaded to separate servers (and at least in theory – even to the servers within your DDoS-protection provider).

On the other hand, note that this additional layer is certainly not a silver bullet; for example, if all our attacking PCs have a GPU such as GTX Titan-X, they will be able to calculate our Puzzles ~100x faster than on a CPU, which will force us to increase Client calculation times to about 40 seconds (that’s per core); even this would be better than not-being-able-to-connect-forever, but in reality it won’t be that grim for two reasons:

Fortunately, not all the PCs-forming-the-botnet are that powerful

If your game is a PC-based 3D one, you yourself can use GPU to solve the “Puzzle”, reducing Client-Side connection delays by the same factor of 100x or so

8 here || denotes concatenation.

9 IMNSHO, for “proof-of-work” purposes, using SHA-1 is ok, but if you prefer – you can use SHA256 etc. instead, though it will incur some additional CPU costs on the server side.

10 or “crypto hash operation” if we’re using HMAC – also “very cheap”

Protection from crypto-DDoS: do you really need it?

The trick to protect yourself from crypto-DDoS described above is not that complicated, but will certainly take some time to implement. As a result, a reasonable thing to ask is “do you really need to implement it in advance?”. Honestly, I do not have a firm answer to this question. On the one hand, when you don’t have such protection, a crypto-DDoS attack can bring your system to its knees in no time (and protection by a DDoS provider might happen to be insufficient). On the other hand, at least as of 2016 crypto-DDoS attacks are very uncommon. Whether somebody will mount a crypto-DDoS attack against your servers – well, you never know in advance.

For high-profile games, I would suggest to play it safe and to implement it somewhere around “beta” stages of the game (as changing protocols during “live” game is usually significantly more complicated); OTOH, chances are that you’ll never need to use this feature. Personally, I prefer to think of it as of insurance – when I’m paying my premiums in hope that my money will go to waste.11

11 especially if it is a life insurance

Common Encryption-Related Notes

When implementing encryption (whether over TCP or over UDP), there are several very important things to keep in mind; while a detailed discussion on these issues will follow in Chapter [[TODO]], here I will simply summarize the most important points out of it without going into explanations:

DO check Server-Side certificate on the Client

To generate Server-Side certificate, DO run your own Certificate Authority and embed root CA certificate within your Client. DO NOT use root certificates which come installed into Client OS. NB: this is a security-by-obscurity feature, which is not needed for non-game apps. Also it does NOT apply to stock exchanges and alike games (more precisely – to games where there is no risk of Client being hacked). For detailed discussion, see Chapter [[TODO]]

DO obfuscate root CA certificate as it is stored within your Client. NB: again, it is a security-by-obscurity feature, not necessary outside of games or for stock exchanges etc.

Regarding choosing a DTLS library:

for not-so-security-critical games, I would say that it doesn’t matter too much which (D)TLS library you’re using. Keep away from abandoned libraries (in this field, “any library with more-than-half-a-year-since-last-update” qualifies as “abandoned”), proprietary libraries, libraries which have only GPL open-source licenses,12 and libraries which don’t support DTLS 1.2, and you should be fine.

Personally, I have had quite good experience with OpenSSL (and no, Heartbleed did not change my positive take on OpenSSL); however, feel free to experiment with GnuTLS,13 mbed TLS (former PolarSSL), and Botan.14

Note that if you do NOT need UDP/DTLS – choice of TLS libraries becomes wider; see [[TODO]] section below for discussion

For Really Security-Critical Games (such as stock exchanges) – I would try to make double-encryption (and yes – I’ve done it myself too). In particular, such double-layer encryption MAY be structured as follows:

An “outer” layer of encryption would be just your usual transport-layer encryption (covering TCP which goes from Server to Client).

An “inner” layer of encryption would be point-to-point encryption going from Server-Side Event-Driven Object to Client-Side Event-Driven Object. You MAY keep this layer optional (just for most-critical-messages) or all-the-time depending on your game.

If you want to be Really Secure – use different (and Really Independent) libraries for each of these layers (and use different cipher suites too). Then, a bug in any of the libraries won’t hit your security too much.

As a side bonus, you will be able to brag about your system being double-encrypted

While we’re at it: you MUST be 200% sure that none of the keys (or more generally, no encryption state) is ever shared between two encryption layers. If your encryption layers are not 100% independent – you can easily end up with a completely-insecure thing.

Which cipher suite to use: to certain extent it is a matter of personal choice, but as of beginning of 2016, I would pick something along the following lines:

for not-so-secure games: something like TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256; that is, if your DTLS library supports it (and if not – try to pick something which is not too different, AND MORE IMPORTANTLY – something which doesn’t have anything marked insecure in [Wikipedia.TLS]15)

if using ECDSA, you MAY settle for 160-bit keys, for RSA – for 1024-bit keys (NB: we’re still speaking about not-so-security-critical games here). Yes, 224/233 ECDSA and 2048 RSA keys are better, but you need to double-check impact of DDoS attacks (and probably think about their mitigation) before going there.

one note about ECDSA in our context: as long as we’re using downloadable clients, we can use pretty much any elliptic curve supported by our DTLS library. However, if you’re dealing with browser clients – you DO need a “compatible” ECC curve such as P-256 (or even better – have a fallback to good old RSA).

In particular, DO disable DTLS 1.0 (as you control both sides of communication, backward compatibility is not an issue)

NEVER EVER touch with a 6-yard stick any cipher suite which uses ADH (=”Anonymous Diffie Hellman”, a.k.a. ANON_DH).16 The same goes for AECDH (ANON_ECDH), RC4, and MD5.17

DO disable all the cipher suites which you’re not going to use. Compile out (using #defines) whatever you can compile out, and disable in configuration whatever you don’t need, but cannot compile out.

For the Client-Side, DO link your TLS/DTLS library statically. It means NOT using TLS/DTLS which come with your OS. NB: once again, it is a security-by-obscurity feature, applicable only in game environments (which need to rely on it because no other protections are really available); moreover, in some of non-game environments such practice can be seen as detrimental to security.18
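Several of the points above (verify the Server certificate, trust only your own root CA rather than the OS store, disable old protocol versions, and enable only the cipher suite you intend to use) can be illustrated with Python's standard `ssl` module. Two caveats: Python's `ssl` module does not do DTLS, so this shows the same policy for TLS-over-TCP only; and `make_client_context` and its parameter are hypothetical names for this sketch. The cipher string is the OpenSSL name for TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256 and assumes the underlying OpenSSL build supports it.

```python
import ssl
from typing import Optional

CIPHERS = "ECDHE-ECDSA-CHACHA20-POLY1305"  # the only TLS 1.2 suite we enable


def make_client_context(root_ca_pem: Optional[str] = None) -> ssl.SSLContext:
    """Build a client context that trusts nothing except our own root CA."""
    # PROTOCOL_TLS_CLIENT turns on certificate and hostname checking by default,
    # and a bare SSLContext does NOT load the OS root certificates.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2   # no legacy protocol versions
    ctx.set_ciphers(CIPHERS)                       # everything else is disabled
    if root_ca_pem is not None:
        # Root CA embedded in the Client (de-obfuscated by the caller beforehand).
        ctx.load_verify_locations(cadata=root_ca_pem)
    return ctx


ctx = make_client_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED, len(ctx.get_ca_certs()))
```

With a DTLS-capable library the calls will differ, but the checklist stays the same: explicit CA, explicit minimum version, explicit cipher list.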

12 they will require either to release all your source code, or purchase a commercial license

13 GnuTLS is licensed under LGPL license, which is usually ok for using in commercial projects – but double-check with your legal guys if applicable

14 On other potentially worthy contenders: WolfSSL and MatrixSSL only have GPL open-source versions (and require a commercial license otherwise), LibreSSL doesn’t seem to support DTLS 1.2, and NSS has a strong dependency on NSPR (and you’re very unlikely to use NSPR otherwise).

15 while I admit that using Wikipedia as a reference-to-determine-security-of-ciphersuite is insecure by design, for not-so-secure games it might still fly

16 believe it or not, I’ve seen it more than once in the wild, though not for DTLS

17 while RC4 MAY have some uses in obfuscation department, and MD5 MAY be used as an improved version of CRC, even such innocent uses may cause quite a bit of trouble – both by being spread around by copy-paste, and by auditors asking all kinds of questions, so I suggest to refrain from using these ugly beasts. For example, for obfuscation purposes Chacha20 is a very good replacement to RC4 (it is even faster).

18 on the other hand, for stock exchanges, I tend to trust my own Client app better than anything-installed-on-end-user-computer, but eventually it often becomes a legal issue elevated outside of developer’s realm

[[To Be Continued…

This concludes beta Chapter XI(c) from the upcoming book “Development and Deployment of Multiplayer Online Games (from social games to MMOFPS, with social games in between)”. Stay tuned for beta Chapter XI(d), describing optimizing TCP for game-like uses.]]