Field notes and occasional musings by Peter on Stuff that happens, from a free software perspective, mainly OpenBSD, FreeBSD.

Saturday, October 5, 2013

The Hail Mary Cloud And The Lessons Learned

Against ridiculous odds and even after gaining some media focus, the botnet dubbed The Hail Mary Cloud apparently succeeded in staying under the radar and kept compromising Linux machines for several years. This article, based on my BSDCan 2013 talk, sums up known facts about the botnet and suggests some common-sense measures to be taken going forward.

The Hail Mary Cloud was a widely distributed, low intensity password guessing botnet that targeted Secure Shell (ssh) servers on the public Internet.

The first activity may have been as early as 2007, but our first recorded data start in late 2008. Links to full data and extracts are included in this article.

We present the basic behavior and algorithms, and point to possible policies for staying safe(r) from similar present or future attacks.

But first, a few words about the devil we knew before the incidents that form the core of the narrative.

The Traditional SSH Bruteforce Attack

If you run an Internet-facing SSH service, you have seen something like this in your logs:
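An illustrative reconstruction (not the actual log data; addresses drawn from the RFC 5737 documentation ranges):

```
Nov 10 15:07:09 skapet sshd[31631]: Failed password for root from 203.0.113.34 port 4739 ssh2
Nov 10 15:07:10 skapet sshd[31635]: Invalid user admin from 203.0.113.34
Nov 10 15:07:10 skapet sshd[31635]: Failed password for invalid user admin from 203.0.113.34 port 4742 ssh2
Nov 10 15:07:11 skapet sshd[31639]: Invalid user oracle from 203.0.113.34
Nov 10 15:07:11 skapet sshd[31639]: Failed password for invalid user oracle from 203.0.113.34 port 4745 ssh2
```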

This is the classic bruteforce attack: rapid-fire login attempts from a single source. (And yes, skapet is the Internet-facing host on my home network.)

The Likely Business Plan

These attempts are often preceded by a port scan, but in other cases it appears the miscreants are just blasting away at random. In my experience, with the gateway usually at the lowest-numbered address in a network, the activity usually turns up there first before moving on to higher-numbered hosts. I'm not really minded to offer help or advice to the people running those scripts, but they might consider scanning the Internet from 255.255.255.255 downwards next time. Anyway, looking at the log excerpts, the miscreants' likely plan is

Try logging in with a likely user name and a guessable password

Repeat, rapid-fire, with the next guess from the list

For every guess that succeeds, PROFIT!

But then the attempts usually come in faster than most of us can type, so with a little help from toolmakers, we came up with an inexpensive first line of defense, easily implemented in perimeter packet filters (aka firewalls).

Traditional Anti-Bruteforce Rules

Rapid-fire bruteforce attacks are easy to head off. I tend to use OpenBSD on Internet-facing hosts, so first we present the technique as it has been available in OpenBSD since version 3.5 (released in 2004), where state tracking options are used to set limits we later act on.

In your /etc/pf.conf, you add a table to store addresses, block access for all traffic coming from members of that table, and finally amend your typical pass rule with some state tracking options. The result looks something like this:
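A minimal sketch along those lines, assuming the macro $ext_if names your external interface; tune the numbers to your own site:

```
table <bruteforce> persist
block quick from <bruteforce>
pass in on $ext_if proto tcp to $ext_if port ssh \
    keep state (max-src-conn 100, max-src-conn-rate 15/5, \
        overload <bruteforce> flush global)
```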

Here, max-src-conn is the maximum number of concurrent connections allowed from one host

max-src-conn-rate is the maximum allowed rate of new connections, here 15 connections per 5 seconds.

overload <bruteforce> means that any host that exceeds either of these limits has its address added to that table

and, just for good measure, flush global means that for any host that is added to our overload table, we kill all its existing connections too.

Basically, problem solved - the noise from rapid-fire bruteforcers generally disappears instantly or after a very few attempts. If you are about to implement something like this (and many do -- the bruteforcer section in my PF tutorial appears to be among the more popular ones), you probably need to watch your logs to find useful numbers for your site, and tweak rules accordingly. I have yet to meet an admin who plausibly claims to never have been tripped up by their overload rules at some point. That's when you learn to appreciate having an alternative way in to your systems, such as a separate admin network.

Traditional Anti-Bruteforce Rules, Linux Style

For those not yet converted to the fine OpenBSD toolset (available in FreeBSD and other BSDs too, with only minor if any variations in details for this particular context), the Linux equivalent would be something like
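One common recipe uses the recent match to approximate the rate limit of 15 new connections per 5 seconds (the numbers are illustrative; adapt them to your site):

```
iptables -A INPUT -p tcp --dport 22 -m state --state NEW \
    -m recent --set --name SSH
iptables -A INPUT -p tcp --dport 22 -m state --state NEW \
    -m recent --update --seconds 5 --hitcount 15 --name SSH -j DROP
```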

But be warned: this gives you only the rate limit, not the maximum number of concurrent connections, and comes with the usual iptables warts. You would also need a separate set of commands for ip6tables.

It's likely something similar is doable with other tools and products too, including possibly some proprietary ones. I've made something of an effort to limit my exposure to the non-free tools, so I can't offer you any more detail. To find out what your present product can do, please dive into the documentation for whichever product you are using. Or come back for some further OpenBSD goodness.

But as you can see, for all practical purposes the rapid-fire bruteforce or floods problem has been solved with trivial configuration tweaks.

But then something happened.

What's That? Something New!

On November 19th, 2008 (or shortly thereafter), I noticed this in my authentication logs:
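An illustrative reconstruction of the pattern (the actual data is linked later in this article; addresses here are from the documentation ranges):

```
Nov 19 15:04:22 rosalita sshd[40232]: Invalid user alias from 192.0.2.88
Nov 19 15:07:32 rosalita sshd[40239]: Invalid user alias from 198.51.100.71
Nov 19 15:11:02 rosalita sshd[40247]: Invalid user amanda from 203.0.113.15
Nov 19 15:14:11 rosalita sshd[40255]: Invalid user amanda from 192.0.2.88
```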

... and so on. The alphabetic progression of user names went on and on.

The pattern seemed to be that several hosts, in widely different networks, would try to access our system as the same user, up to minutes apart. When any one host came back, it was more likely than not several user names later. The full sequence (it stopped December 30th) is available here.

Take a few minutes to browse the log data if you like. It's worth noting that rosalita was a server that had a limited set of functions for a limited set of users, and basically no other users than myself ever logged in there via SSH, even if they for various reasons had the option open to them. So in contrast to busier sites where sequences like this might have drowned in the noise, here it really stood out. And I suppose after looking at the data, you can understand my initial reaction.

The Initial Reaction

My initial reaction was pure disbelief.

For the first few days I tried tweaking PF rules, playing with the attempts/second values and scratching my head, going, "How do I make this match?"

I spent way too much time on that, and the short version of the answer to that question is, you can't. With the simple and in fact quite elegant state tracking options, you will soon hit limits (especially time limits) that interfere with normal use, and you end up blocking legitimate traffic.

So I gave up on prevention (which really only would have rid me of a bit of noise in my authentication logs), and I started analyzing the data instead, trying to eyeball patterns that would explain what I was seeing. After a while it dawned on me that this could very well be a coordinated effort, using a widely distributed set of compromised hosts.

So there was a bit of reason in there after all. Maybe even a business plan or model.
Digging further into the data, I came up with -

Bruteforcer Business Plan, Distributed Version

The Executive Summary would run something like this: Have more hosts take turns, round robin-ish, at long enough intervals to stay under the radar, guessing for weak passwords.

The plan is much like before, but now with many hosts on the attacking side:

Pick a host from the pool, and assign it a user name and password (picked from a list, dictionary or pool)

Have that host try logging in to the chosen target with the assigned user name and password

If successful, report back to base (we theorize); else wait for instructions (again, we speculate)

If your password is weak, you will be 0WN3D, sooner rather than later.

There's a whole fleet out there, and they're coordinated.
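The round-robin idea is easy to sketch. What follows is a toy simulation of the scheduling we inferred, not actual botnet code; the host pool and dictionary are made up for illustration:

```python
from itertools import cycle

def schedule_attempts(hosts, credentials):
    """Round-robin scheduling as we inferred it: each successive
    guess comes from the next host in the pool, so any one source
    address only returns after the whole pool has had a turn."""
    pool = cycle(hosts)
    return [(next(pool), user, password) for user, password in credentials]

# Made-up pool and dictionary, for illustration only.
hosts = ["192.0.2.10", "198.51.100.7", "203.0.113.42"]
credentials = [("admin", "admin"), ("alias", "123456"),
               ("amanda", "password"), ("art", "qwerty")]

plan = schedule_attempts(hosts, credentials)
for source, user, password in plan:
    print(source, user, password)
```

Note how the user names progress alphabetically while no single source address fires twice in a row, which is exactly why per-source rate limits never trip.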

At this point I thought I had something useful, so I started my first writeup for publication. I had just started a new job at the time, and I think I mentioned the oddities to some of my new colleagues (that company is unfortunately defunct, but the original linked articles give some information). Anyway, I wrote and published, hoping to generate a little public attention for myself and my employer. And who knows, maybe even move a few more copies of that book I'd written the year before.

Initial Public Reaction

On December 2, 2008, I published the first blog post in what would become a longish sequence, A low intensity, distributed bruteforce attempt, where I summarized my findings. It's slightly more wordy than this piece, but if I've piqued your interest so far, please go ahead and read it. And as to a little public attention, I got my wish: the post ended up slashdotted, making me the first among my colleagues to have their name on the front page of Slashdot.

That brought

more disbelief (see the Slashdot and other comments), but also

confirmation, via comments and email, that others were seeing the same thing, and that the first occurrences may have been seen up to a year earlier (November-ish 2007).

The slow bruteforcers were not getting in, so I just went on collecting data. I estimated they would be going on well past New Year's if they were to reach the end of the alphabet.

On December 30th, 2008, The Attempts Stopped

The attempts came to an end, conveniently while I was away on vacation. The last entries were:

The Slashdot story brought comments and feedback, with some observations from other sites. Not a lot of data, but enough to confirm the patterns we had observed. The attempts were all password authentication attempts; no other authentication methods were tried.

For the most part the extended incident consisted of attempts on an alphabetic sequence of 'likely' user names, but all sites also saw at least one long run of root only attempts. This pattern was to repeat itself, and also show up in data from other sources.

There would be anything from seconds to minutes between attempts, but attempts from any single host would come at much longer intervals.

First Round Observations, Early Conclusions

Summing up what we had so far, here are a few observations and attempts at early conclusions.

At the site where I had registered the distributed attempts, the Internet-reachable machines all ran either OpenBSD or FreeBSD. Only two FreeBSD boxes were contacted.

The attackers were hungry for root, so having PermitRootLogin no in our sshd config anywhere Internet facing proved to be a good idea.

We hadn't forced our users to keys only, but a bit of luck and John the Ripper (/usr/ports/security/john) saved our behinds.

The sequence started with 2318 attempts at root before moving on to admin and proceeding with the alphabetic progression of user names. The incident played out pretty much like the previous one, only this time I was sure I had captured all relevant data before my logs were rotated out of existence.

The data is available in the following forms: Full log here, one line per attempt here, users by frequency here, hosts by frequency here.

The dt_ssh5 file was found installed in /tmp on affected systems. The likely reason our perpetrators chose that directory is that /tmp tends to be both world-readable and world-writeable.

Again, this points us to the three basic lessons:

Stay away from guessable passwords

Watch for weird files (stuff you didn't put there yourself) anywhere in your file system, even in /tmp.

Internalize the fact that PermitRootLogin yes is a bad idea.

dt_ssh5: Basic Algorithm

The discovery of dt_ssh5 made for a more complete picture. A rough algorithm suggested itself:

1. Pick a new host from the pool, and assign it a user name and password.

2. For each host,

2.1. Try the assigned user name and password on the target.

2.2. If successful, drop the dt_ssh5 binary in /tmp, start it, and report back to base.

2.3. Else, wait for instructions.

3. Go to 1.

For each success at 2.2, PROFIT!

I never got myself a copy, so the actual mechanism for communicating back to base remains unclear.

The Waves We Saw, 2008 - 2012

We saw eight sequences in all (complete list of articles in the References section at the end).

The 2009-09-30 sequence was notable for trying only root, the 2012-04-01 sequence for being the first to attempt access to OpenBSD hosts.

We may have missed earlier sequences; early reports place the first similar attempts as far back as 2007.

For A While, The Botnet Grew

From our point of view, the swarm stayed away for a while and came back stronger, for a couple of iterations, possibly after tweaking their code in the meantime. Or rather, the gaps in our data represent times when it focused elsewhere.

Clearly, not everybody was listening to online rants about guessable passwords.

For a while, the distributed approach appeared to be working.

It was (of course) during a growth period that I coined the phrase "The Hail Mary Cloud".

Instantly, a myriad of "Hail Mary" experts joined the insta-punditry on Slashdot and elsewhere.

It Went Away Or Dwindled

Between August 2010 and October 2010, things either started going badly for The Hail Mary Cloud, or possibly they focused elsewhere.

I went on collecting data.

There wasn't much to write about, except possibly that the botnet's command and control was redistributing effort based on past success, aiming at more crackable hosts elsewhere.

And Resurfaced In China?

Our last sighting so far was in April 2012. The data is preserved here.

This was the first time we saw Hail Mary Cloud style attempts at accessing OpenBSD systems.

The majority of attempts were spaced at least 10 seconds apart, and until I revisited the data recently, I thought only two hosts in China were involved.

In fact, 23 hosts made a total of 4757 attempts at 1081 user IDs, netting 0 successful logins.

The question anybody reading this far will be asking is, what should we do in order to avoid compromise by the password guessing swarms? To my mind, it all boils down to common sense systems administration:

Mind your logs. You can read them yourself, or train a robot to. I use logsentry; other monitoring tools can be taught to look for anomalies (failed logins and the like).

Keep your system up to date. If you are not running OpenBSD, check openssh.org for the latest version, check what your system has, and badger the maintainer if it's outdated.

And of course, configure your applications such as sshd properly -

sshd_config: 'PermitRootLogin no' and a few other items

These two settings in your sshd_config will give you the most bang for the buck:

PermitRootLogin no
PasswordAuthentication no

Make your users generate keys, and add each *.pub file's contents to the user's ~/.ssh/authorized_keys file on the server.
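For example (the file names and the user@server address are placeholders; adjust to taste):

```shell
# Generate an RSA key pair; pick a good passphrase when prompted.
ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa

# Install the public key in ~/.ssh/authorized_keys on the server.
ssh-copy-id -i ~/.ssh/id_rsa.pub user@server
```

With PasswordAuthentication no in place on the server, guessing swarms have nothing left to guess at.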

For a bit of background, Michael W. Lucas: SSH Mastery (Tilted Windmill Press 2013) is a recent and very readable guide to configuring your SSH (server and clients) sensibly. It's compact and affordable, too, and you can even buy it via the OpenBSD site.

Keep Them Out, Keep Them Guessing

At this point, most geeks would wax lyrical about the relative strengths of different encryption schemes and algorithms.

Being a simpler mind, I prefer a different metric for how good your scheme is, or the effectiveness of your obfuscation (also see entropy): how many bytes does an attacker need to get exactly right?

Port knocking usually means having all ports closed, but with a daemon reading your firewall's logs for a predetermined sequence of ports. Knock on the correct ports in sequence, and you're in.
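As a sketch only (the event format and the port sequence here are hypothetical), the matching step such a daemon applies to firewall log events might look like this:

```python
# Hypothetical sketch of a port-knocking check: scan firewall log
# events of the form (source_ip, destination_port) and admit any
# source that hits the secret ports in the right order.
KNOCK_SEQUENCE = [7000, 8000, 9000]  # the shared secret; in effect a password

def completed_knocks(events, sequence=KNOCK_SEQUENCE):
    """Return the set of source addresses that sent the full sequence."""
    progress = {}   # source_ip -> index of the next expected port
    admitted = set()
    for source, port in events:
        expected = progress.get(source, 0)
        if port == sequence[expected]:
            progress[source] = expected + 1
            if progress[source] == len(sequence):
                admitted.add(source)
                progress[source] = 0
        else:
            progress[source] = 0  # any wrong port starts the source over
    return admitted
```

A real implementation would also expire per-source state and actually alter the ruleset; this shows only the matching logic.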

Another dirty little secret: It's possible to implement port knocking with only the tools in an OpenBSD base system. No, I won't tell you how.

Executive Summary: Don't let this keep you from keeping your system up to date.

To my mind port knocking gives you:

Added complexity or, one more thing that will go wrong. If the daemon dies, you've bricked your system.

An additional password that's hard to change. There's nothing magical about TCP/UDP ports. It's a 16 bit number, and in our context, it's just another alphabet. The swarm will keep guessing. And it's likely the knock sequence (aka password) is the same for all users.
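Back-of-the-envelope arithmetic (my numbers, not from any measured attack) shows why a knock sequence is just another short password:

```python
import math

# A knock sequence of three 16-bit port numbers is a 48-bit secret:
knock_bits = 3 * 16

# An 8-character password drawn at random from the ~94 printable
# ASCII characters carries slightly more entropy, about 52.4 bits:
password_bits = 8 * math.log2(94)

print(knock_bits, round(password_bits, 1))
```

In other words, a three-port knock is roughly as guessable as a short random password, with the added drawback that it is typically shared by all users.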

You won't recognize an attack until it succeeds, if even then. Guessing attempts will be indistinguishable from random noise (try a raw tcpdump of any internet-facing interface to see the white noise you mostly block drop anyway), so you will have no early warning.

Port knocking proponents seem to have sort of moved on to single packet authentication instead, but even those implementations still contain the old port knocking code intact.

If you want a longer form of those arguments, my November 4, 2012 rant Why Not Use Port Knocking? was my take (with some inaccuracies, but you'll live).

There's No Safety In High Ports Anymore

Another favorite suggestion is to set your sshd to listen on some alternate port instead of the default port 22/TCP.

People who did so have had a few years of quiet logs, but recent reports show that whoever is out there has the resources to scan alternate ports too.

Once again, don't let running your sshd on an alternate port keep you from keeping your system up to date.

Reports of such activity at alternate ports, with logs, trickle in from time to time, but of course any site with a default deny packet filtering policy will not see traces of these scans unless you go looking specifically at the mass of traffic that gets dropped at the perimeter.

Final thoughts, for now

Microsoftish instapundits were quick to assert that ssh is insecure.

They're wrong. OpenSSH (which is what essentially everyone uses) is maintained as an integral part of the OpenBSD project, and as such is a very thoroughly audited mass of code. And it keeps improving with every release.

I consider the Hail Mary Cloud an example of distributed, parallel problem solving, conceptually much like SETI@Home but with different logic and of course a more sinister intent.

Computing power is cheap now, getting cheaper, and even more so when you can leverage other people's spare cycles.

The huge swarm of attackers concept is as I understand it being re-used in the recent WordPress attacks. We should be prepared for swarm attacks on other applications as soon as they reach a critical mass of users.

There may not be a bullseye on your back yet (have you looked lately?), but you are an attractive target.

Fortunately, sane system administration practices such as those outlined above will go a long way towards thwarting intrusion attempts.

And in other news, it appears that GitHub has been subject to an attack matching the characteristics we have described. A number of accounts with weak passwords were compromised, and investigations appear to be still ongoing. Fortunately, GitHub has started offering other authentication methods.

UPDATE 2014-09-28: Since early July 2014, we have been seeing similar activity aimed at our POP3 service, with usernames taken almost exclusively from our spamtrap list. The article Password Gropers Take the Spamtrap Bait has all the details and log data as well as references to the spamtrap list.

UPDATE 2016-08-10: The POP3 gropers never went away entirely and soon faded into a kind of background noise. In June of 2016, however, they appeared to have hired themselves out to a systematic hunt for Chinese user names. The article Chinese Hunting Chinese over POP3 in Fjord Country has further details, and as always, links to log data and related files.

There is a tool called denyhosts that solves your problem trivially. I myself wrote a similar tool in awk and shell and used it for many years. It is funny how the same problem comes up over and over again and people look for their own solution and analysis instead of using existing tools.

Sorry, there's no useful reason to allow remote root logins on *ANY* server, whether it's facing the Intardwebz or not. Any password you can come up with can probably be broken, it may just take some effort. Why open yourself to the risk? There's no benefit to it whatsoever. Sudo is your friend. Learn it, use it, love it.

I used to create an alternate root account and ssh altroot@, with password. I now use sudo. Using the formula above, security is measured in how many bytes the attackers have to get exactly right. It turns out that renaming the root account (to a username as long as your non-root account) is just as secure as signing on as yourself and using sudo; after all, you're just typing the same password over again at the sudo prompt.

I would switch to ssh keys only, except that I want to be able to sign on in the event I don't have my ssh key with me.

We should log what password these slow-brutes are using, because they're effectively broadcasting the ip addresses, usernames and passwords of all of the hosts in their botnet.

I've got a SSH server on a different port + check the logs daily, and in the last 5 years of running it, I can't think of a time that we've had anyone trying to connect to it. The other ssh server (chrooted/sftponly) on the standard port on that machine gets lots of attempts though. Perhaps that's part of the solution? - what amounts to a honey pot on a standard port and the actual SSH server somewhere else?

A little thing that helped me, since I'm administrating my servers from a randomly changing dialin IP: only allow access to your services from IPs you are actually on. Create an iptables entry for that which updates regularly (I wrote myself a script for that).

For that, use dyndns of any kind. Of course, people like the NSA or other bigbiz could still force companies (if you use a commercial service) into changing your dyndns IP into something of their preference. But even then all they have is a normal SSH login with pubkey authentication and "PermitRootLogin no". AND you'd instantly know something went astray if you suddenly couldn't access your servers via SSH (or other services like e.g. NRPE) anymore.

In short: Allow access to services on your public IP addresses only to known-probably-ok IP addresses, and most of your botnet-related problems are gone. Of course there are still exploits out there, so you are not completely in the green. Good luck :)

I use port knocking via my Mikrotik router to allow access to port 22. I know your dislike of port knocking, but it pretty much stops these random IP scans. You really need multiple layers of security to protect your servers, and it's best if all that traffic is stopped before it even reaches your server.

With my setup, you have to hit random ports with specific data (the router does Layer 7 inspection) on 5 different non-sequential ip addresses on my subnet all within 1 second to enable access. Hit any other ip/port with any packet of data, and you are going to be starting over. This of course then just gives you access to the NATed port (not 22) to ssh to my servers (key only, no pass auth)

There don't even have to be ports or servers actually listening on those IPs; it's all handled in the router itself. I then have rules in place that will black hole your IP for 24 hours if you attempt to brute force it.

Hey Peter... I've been seeing these style attacks in my logs as well... also running OpenBSD and started noticing them back when you wrote that first article! They are still there, plugging away day after day.

Keys are good, but only as good as your endpoint security. If the user doesn't password-protect his key and his laptop gets stolen, the thief/attacker is in unless the admin gets notified in time and revokes the key. If the user does password-protect his key, there's no way to verify password strength without some sort of agent running on the laptop.

On the other hand, brute-forcing a key's passphrase may be harder due to the lack of plain text in the key file (or maybe not, I don't know). And I suppose it's still less of a risk than brute-forcers over the net.

I also saw the same slow distributed attacks on my OpenBSD box back in 2008. I always wanted to reply back to these attacks. One day I tried to log into one of the attacking hosts by guessing the login/pass. It took me a few tries to get admin/admin. Then I deleted the executable in /tmp and killed the daemon in memory. I also changed the admin password before logging out.

It gave me the idea to write a daemon to do this automatically, based on wrong user names (ones not in passwd) or the overload table. I never managed to muster the motivation to write it, but it could have been an interesting project, setting aside its illegal nature.


About Me

Puffyist, daemon charmer, penguin wrangler. Wrote The Book of PF (3rd ed out now, see http://www.nostarch.com/pf3), rants on sanity in IT (lack of) at http://bsdly.blogspot.com/. Please read http://www.bsdly.net/~peter/rentageek.html before contacting.