Some random thoughts about crypto. Notes from a course I teach. Pictures of my dachshunds.

Matthew Green

I'm a cryptographer and professor at Johns Hopkins University. I've designed and analyzed cryptographic systems used in wireless networks, payment systems and digital content protection platforms. In my research I look at the various ways cryptography can be used to promote user privacy.

I don’t have much to say about Gauss that hasn’t been covered elsewhere. Still, for those who don’t follow this stuff routinely, I thought I might describe a couple of the neat things we’ve learned about it.

Here’s the nutshell summary: Gauss is your basic run-of-the-mill government-issued malware, highly modularized and linked to the same C&C infrastructure that Flame used. It seems mainly focused on capturing banking data, but (as I’ll mention in a second) it may do other things. Unlike Flame, there’s no evidence that Gauss uses colliding MD5 certificates to get itself onto a host system. Though, in fairness, we may not yet have the complete picture at this point.

So if there are no colliding certificates, what’s interesting about Gauss? So far as I can tell, only two things. First, it installs a mystery font. Second — and far more interesting — it contains an encrypted payload.

Palida Narrow. Every Gauss-infected system gets set up with a new font called Palida Narrow, which appears to be a custom-generated variant of Lucida Bright with some unusual glyphs in it. From Kaspersky’s report:

[Gauss] creates a new TrueType font file “%SystemRoot%\fonts\pldnrfn.ttf” (62 668 bytes long) from a template and using randomized data from the ShutdownInterval key.

Now this looks exciting! Unfortunately Kaspersky has not explained how the randomization works, or indeed if the data is truly random. This leaves us with nothing to do but speculate.

And plenty of folks have. Theories range from the practical (remote host detection) to the slightly wild (on-site vulnerability fuzzing). My favorite is the speculation that Palida is used to steganographically fingerprint the author of certain printed materials. While this theory is almost certainly wrong, it’s not completely nuts, and even has some precedent in the research literature.

Godel. This is the encrypted payload, and it should be setting your hair on fire, if only because it attempts to replicate itself via a vulnerability in the code that Windows uses to handle USB sticks. This is the very same vector that Stuxnet used to infect the air-gapped centrifuge controllers at Natanz. It’s a good indicator that Godel is targeted at a similarly air-gapped system.

Of course the question is: which system? Godel goes to great lengths to ensure that we don’t know.

Presumably the designers made this decision based on some bitter experience with Stuxnet, which didn’t protect its code at all. The result in Stuxnet’s case was that researchers quickly decompiled the payload and identified the parameters that it looked for in a target system. Somebody (presumably Stuxnet’s handlers) was unhappy about this: on July 15, 2010 a distributed denial of service attack crippled the industrial control mailing listservs where this was being discussed.

To avoid a repeat of this episode, Gauss’s designers chose to encrypt the Godel payload under a key derived from a specific configuration on the targeted computer.

The details can be found in this Kaspersky post. To make a long story short, Godel derives an encryption key by repeatedly MD5 hashing a series of (salted) executable filenames and paths located on the target system. Only a valid entry will unlock the program sections, allowing Godel to do its job.

The key derivation process is performed using 10,000 iterations of MD5. The resulting key is fed to RC4. While the use of RC4 and MD5 may seem a little bit archaic (c’mon guys, get with the 21st century!), it likely reflects a decision to use the Microsoft CryptoAPI across a broad range of Windows versions rather than some sort of cryptographic retro-fetish.
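To make the shape of this concrete, here is a minimal Python sketch of salted, iterated MD5 feeding an RC4 key. Only the ingredients (MD5, a salt, 10,000 iterations, RC4) come from the description above. The salt value, the UTF-16 encoding, the way the path and filename are joined, and the names `derive_key` and `rc4` are assumptions made for illustration, not Gauss's actual layout.

```python
import hashlib


def derive_key(path_entry: str, filename: str, salt: bytes,
               iterations: int = 10_000) -> bytes:
    """Iterated MD5 over a salted (path, filename) pair.

    The input layout (concatenation order, UTF-16LE encoding, salt
    appended once) is an assumption, not taken from the malware.
    """
    state = (path_entry + filename).encode("utf-16-le") + salt
    for _ in range(iterations):
        state = hashlib.md5(state).digest()
    return state  # 16 bytes, used directly as the RC4 key


def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4; encryption and decryption are the same operation."""
    # Key-scheduling algorithm: permute 0..255 under the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    # Pseudo-random generation: XOR the keystream into the data.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


# Hypothetical inputs: a PATH entry and a program name on the target.
key = derive_key("C:\\Tools", "target.exe", salt=b"example-salt")
ciphertext = rc4(key, b"payload section")
```

The point of the design is visible in the last two lines: without the right path and filename there is no key, and without the key the payload section is just noise.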

The real question is: how well does it work?

Probably very well, with caveats. Kaspersky says they’re looking for a world-class cryptographer to help them crack the code. What they should really be looking for is someone with a world-class GPU.

As best I can see, the only limitation of Gauss’s approach is that the designers should have used a more time-intensive function to derive their keys. 10,000 iterations of MD5 sounds like a lot, but really it isn’t; not in a world with efficient GPUs that you can rent. This code won’t be broken based on weaknesses in RC4 or MD5. It will be broken by exhaustively searching the file path/name space using a GPU (or even FPGA)-based system, or possibly just getting lucky.
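The exhaustive search described here can be sketched in a few lines of Python. This is a CPU-only toy, not a GPU cracker, and it rests on assumptions: that a candidate can be checked against some known verification hash (the hypothetical `target` below), that inputs are joined and encoded as shown, and that the names `iterated_md5` and `search` are purely illustrative.

```python
import hashlib
from itertools import product


def iterated_md5(data: bytes, iterations: int = 10_000) -> bytes:
    """Apply MD5 repeatedly, feeding each digest back in."""
    state = data
    for _ in range(iterations):
        state = hashlib.md5(state).digest()
    return state


def search(paths, filenames, salt: bytes, target: bytes,
           iterations: int = 10_000):
    """Try every (path, filename) pair until one hashes to `target`.

    Cost is len(paths) * len(filenames) * iterations MD5 calls in the
    worst case, which is why the iteration count matters so much.
    """
    for path_entry, filename in product(paths, filenames):
        candidate = (path_entry + filename).encode("utf-16-le") + salt
        if iterated_md5(candidate, iterations) == target:
            return path_entry, filename
    return None


# Toy demonstration with made-up candidates and a low iteration count.
salt = b"example-salt"
secret = ("C:\\Tools" + "target.exe").encode("utf-16-le") + salt
target = iterated_md5(secret, iterations=100)
found = search(["C:\\Windows", "C:\\Tools"],
               ["notepad.exe", "target.exe"],
               salt, target, iterations=100)
```

Each candidate is independent of the others, so the loop parallelizes trivially. That is exactly the property that makes GPUs and FPGAs a good fit for this kind of work.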

No doubt Kaspersky is working on such a project right now. If we learn anything more about the mystery of Godel, it will almost certainly come from that work.

Well according to Kaspersky, it is 10,000 iterations. This probably translates to a few ms on an average PC; on mine, using Java, 5ms.

Of course it's still not a lot, but I'm not sure it can easily be increased significantly in this case. Indeed, the case is quite different from password hashing, in that the malware does not perform this calculation only once, but has to do it for each combination of values, and I guess this can quickly mean thousands of entries. Hence, the overall calculation can take on the order of seconds. The question is, did they have a requirement on the maximum time the thing can take to run? Clearly we don’t know enough about the malware yet to answer such questions.
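A rough way to check this arithmetic yourself is to time one 10,000-iteration MD5 chain and multiply by the number of combinations. The sketch below does that in Python; the input bytes, the figure of 5,000 combinations, and the helper names are hypothetical, and the measured time will vary with machine and language.

```python
import hashlib
import time


def time_derivation(iterations: int = 10_000) -> float:
    """Return the seconds taken by one iterated-MD5 derivation."""
    state = b"hypothetical-path-and-filename"
    start = time.perf_counter()
    for _ in range(iterations):
        state = hashlib.md5(state).digest()
    return time.perf_counter() - start


def total_seconds(per_derivation: float, combinations: int) -> float:
    """Total time if every combination needs its own derivation."""
    return per_derivation * combinations


per_key = time_derivation()
print(f"one derivation: {per_key * 1000:.1f} ms")
print(f"5,000 combinations: {total_seconds(per_key, 5_000):.1f} s")
```

Using the 5 ms figure above, a thousand combinations already comes to about 5 seconds (`total_seconds(0.005, 1000)`), so a tenfold increase in iterations would push the malware's own run time toward a minute.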

Still, an interesting question is: why did they choose this exact scheme for deriving the key? I’m tempted to think that they could have performed the same targeting using only the filename; I cannot come up with a scenario where they would need to target systems that have the required filename and, in addition, some specific binary in the PATH variable. Hence, I think that combining the filenames with the PATH variable was simply a way to increase the number of possible values you have to go through to crack the key. In other words, it could have been a tradeoff: fewer iterations in exchange for a larger search space.