Twitter RNG is Powered By Memes

Twitter is kind of a crazy place. World leaders doing verbal battle, hashtags that rise and fall along with the social climate, and a never ending barrage of cat pictures all make for a tumultuous stream of consciousness that runs 24/7. What exactly we’re supposed to do with this information is still up to debate, as Twitter has yet to turn it into a profitable service after over a decade of operation. Still, it’s a grand experiment that offers a rare glimpse into the human hive-mind for anyone brave enough to dive in.

One such explorer is a security researcher who goes by the handle [x0rz]. He’s recently unveiled an experimental new piece of software that grabs Tweets and uses them as a “noise” to mix in with the Linux urandom entropy pool. The end result is a relatively unpredictable and difficult to influence source of random data. While he cautions his software is merely a proof of concept and not meant for high security applications, it’s certainly an interesting approach to introducing humanity-derived chaos into the normally orderly world of your computer’s operating system.

Noise sampling before and after being merged with urandom

This hack is made possible by the fact that Twitter offers a “sample” function in their API, which effectively throws a randomized collection of Tweets at anyone who requests it. There are some caveats here, such as the fact that if multiple clients request a sample at the same time they will both receive the same Tweets. It’s also worth mentioning that some characters are unusually likely to make an appearance due to the nature of Twitter (emoticons, octothorps pound signs, etc), but generally speaking it’s not a terrible way to get some chaotic data on demand.

On its own, [x0rz] found this data to be a good but not great source of entropy. After pulling a 500KB sample, he found it had an entropy of 6.5519 bits per byte (random would be 8). While the Tweets weren’t great on their own, combining the data with the kernel’s entropy pool at /dev/urandom provided something that looked a lot less predictable.

The greatest weakness of using Twitter as a source of entropy is, of course, the nature of Twitter itself. A sufficiently popular hashtag on the rise might be just enough to sink your entropy. It’s even possible (though admittedly unlikely) that enough Twitter spam bots could ruin the sample. But if you’re at the point where you think hinging your entropy pool on a digital fire hose of memes and cat pictures is sufficient, you’re probably not securing any national secrets anyway.

(Editor’s note: The way the Linux entropy pool mixes it together, additional sources can only help, assuming they can’t see the current state of your entropy pool, which Twitter cats most certainly can’t. See article below. Also, this is hilarious.)

It’s the content of the Tweets that is being used to feed the entropy pool. So long as you aren’t just getting the same ones every time, it doesn’t matter which ones you get.

Honestly, at the rate new Tweets must pour into the system on a global scale, it would probably be enough to just take the last ~100 Tweets or whatever at any given time and you’d know there’s little chance anyone else got the same set.