Tuesday, December 4, 2012

Lessons from the S.C. breach

In October, the South Carolina Department of Revenue discovered that it had been breached and contacted Mandiant to assist in the investigation and response. All told, millions of social security numbers and hundreds of thousands of bank/credit card numbers had been stolen.

In November, Mandiant published their findings. This is exciting. All we usually get is a news article lacking in technical detail. This we can actually learn from.

My goal in this blog post is to explore what, in hindsight, the S.C. Department of Revenue could or should have done better. Please read the Mandiant report before you move on.

What Happened?

Mandiant published a summary by date. I'm going to further condense that and label the events from Day 1 to Day 66 so that it's easier to calculate the time elapsed between events. This is all from the report. It's their work, not mine. If you didn't read their report, go do it now.

Day 1) The attack started with a phishing email targeted at multiple employees. The email contained a link to a malicious program that could steal a user's credentials.

Day ??) One of the employees (or his/her computer) gave up the goods and the phishing attack was successful.

Day 15) The attacker used a username and password to log in to a remote access service, connected to the user's machine and then accessed other systems and databases.

Day 17) The attacker stole additional passwords from six computers.

Day 20) The attacker stole passwords for "all Windows accounts" and installed a backdoor on one server.

Day 21 - Day 30) The attacker accessed approximately 38 systems using one or more compromised accounts; the report doesn't say explicitly whether it was the same account used earlier. My impression is that it was not. The attacker performed recon on several of these days. The attacker also authenticated to a couple of web servers but didn't accomplish anything.

Day 31 - Day 33) The attacker copied database backup files and sent those files over the Internet.

Day 34) The attacker interacted with ten systems and performed more recon.

Day 35 - Day 65) Nothing happens.

Day 66) The attacker(s) checked on the backdoor they installed.
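Since the whole point of the Day labels is to make elapsed time easy to calculate, here is a minimal sketch of that arithmetic. The day numbers come from the condensed timeline above; the event names and the days_between helper are my own labels, not anything from the report.

```python
# Day labels copied from the condensed timeline above (Day 1 = phishing email).
# The event names are shorthand chosen for this sketch.
TIMELINE = {
    "phishing email": 1,
    "first remote login": 15,
    "passwords stolen from six computers": 17,
    "all Windows accounts dumped, backdoor installed": 20,
    "database backups copied": 31,
    "last activity before the quiet period": 34,
    "backdoor check": 66,
}


def days_between(earlier: str, later: str) -> int:
    """Return the number of days elapsed from one labeled event to another."""
    return TIMELINE[later] - TIMELINE[earlier]


if __name__ == "__main__":
    # Gap between the phish and the first use of stolen credentials.
    print(days_between("phishing email", "first remote login"))
    # Time from first login to the dump of all Windows accounts.
    print(days_between("first remote login",
                       "all Windows accounts dumped, backdoor installed"))
```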

Analysis

The initial phish (Day 1 to Day ??)

The gap between the initial phishing attack and the first time the attacker actually used a user's credentials was two weeks. Presumably the user(s) who responded did so within a few days. I'm not sure why the attacker waited to use them. The attacker could have performed additional reconnaissance from the outside or been busy on another project.

So, what could the Department of Revenue have done to stop this? What wouldn't have helped?

If the phishing email had been reported (e.g. by a wary recipient) or otherwise detected during this window, the attack might have been stopped. The IT/security staff could have cleaned or re-imaged the affected machines and forced the affected users to change their passwords. If better education/awareness had prompted just one of the targeted users to report this, it would have been worthwhile.

If this malware was custom, and I'm guessing it was, AV software would have been of little to no use.

If the malware stole passwords by dumping the Windows credential cache, this could have been prevented by not giving users local administrator privileges. It's also possible that the malware retrieved the username and password some other way and local admin privileges were not an issue.

If they had used two-factor authentication, it would have been much more difficult for the attacker to pull this off. With one-time passwords, the attacker would have had to steal the one-time password and use it immediately, probably via more sophisticated malware. The attacker could also have tried to deliver a malware package that could stay resident on the user's system and phone home to give the attacker access. This would have been more complicated and easier to detect but still very possible.
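To show why a stolen one-time password has to be used immediately, here is a minimal sketch of how time-based one-time passwords are commonly generated (HOTP from RFC 4226, with the counter derived from the clock as in RFC 6238). The report doesn't say what two-factor product the department might have used, so the 30-second step, the six-digit length and the function names are assumptions for illustration.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) for a given counter value."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble picks the offset
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Time-based variant (RFC 6238): the counter is the current time step."""
    return hotp(secret, int(unix_time) // step, digits)


if __name__ == "__main__":
    shared_secret = b"12345678901234567890"  # test secret used in the RFCs
    # A code captured at t=59 is no longer valid once the next time step
    # begins, which is why a phished code has to be replayed right away.
    print(totp(shared_secret, 59), totp(shared_secret, 60))
```

A real deployment would also rate-limit guesses and usually accepts codes from an adjacent time step to tolerate clock drift, which gives the attacker a slightly longer window than a single step.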

Initial Access (Day 15)

Why are remote users allowed to log in to the internal network with just a username and password? Even if two-factor authentication wasn't used internally, remote users should have been forced to connect through a VPN using separate authentication (e.g. digital certificates).
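As a rough illustration of certificate-based authentication, here is a sketch of a TLS server context that refuses any client that doesn't present a certificate signed by a trusted CA. This is not a VPN configuration and not anything from the report; it only shows the idea using Python's standard ssl module. The build_server_context name and the ca_file parameter are hypothetical.

```python
import ssl
from typing import Optional


def build_server_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Create a server-side TLS context that requires a client certificate.

    ca_file is a hypothetical path to the CA certificate that signs the
    remote users' certificates.
    """
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    # Reject the handshake unless the client presents a valid certificate.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file is not None:
        ctx.load_verify_locations(cafile=ca_file)
    # A real server would also call ctx.load_cert_chain(...) with its own
    # certificate and key before accepting connections.
    return ctx
```

With this in place, a stolen username and password alone is not enough to connect; the attacker also needs the private key that goes with a user's certificate.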

Password Stealing and Backdoors (Day 17-20)

The report doesn't give a lot of details here. How did the attacker steal additional passwords? The attacker probably gained admin privileges at some point. The reference to passwords for "all Windows accounts" probably means he got domain admin rights and dumped everything on Day 20.

Assuming the attacker got admin rights, how did he do it? Was the first victim an admin? Did the attacker escalate using an exploit? Was a patch available? What tools did he use to steal passwords? Did the attacker use publicly available tools such as creddump or pwdump? If one or more of the internal systems were unpatched, the department could have prevented this by establishing a stronger policy and procedure for patching. If publicly available tools were used to dump passwords, why didn't the AV or endpoint security products detect them? Was the attacker able to disable the anti-virus first? Could this be detected by centralized AV management? Unfortunately, I have more questions than answers.

The report indicates that this was (at least partially) a Windows network. One of the problems with Windows networks is that the network authentication and password hashing are awful. The NTLM protocols have various problems and can allow attackers to brute force credentials after observing network authentication. The hashes use MD4, which is incredibly fast, and they are unsalted, so attackers can guess billions of passwords per second. And, to top it all off, the hashes are the actual secret used for authentication (not the password). Attackers can "pass the hash" to log in without knowing the associated password.
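The "pass the hash" problem is easier to see in a toy model. The sketch below is not NTLM: it uses SHA-256 as a stand-in for the unsalted MD4 hash and a plain HMAC as a stand-in for the challenge-response computation, and all of the function names are made up for illustration. What it does show is the structural flaw: the response is computed from the stored hash, so anyone holding the hash can answer the challenge.

```python
import hashlib
import hmac
import os


def password_hash(password: str) -> bytes:
    """Unsalted, single-pass hash (SHA-256 standing in for MD4)."""
    return hashlib.sha256(password.encode("utf-16-le")).digest()


def challenge_response(hash_secret: bytes, challenge: bytes) -> bytes:
    """Client side: answer the server's challenge using only the hash."""
    return hmac.new(hash_secret, challenge, hashlib.sha256).digest()


def server_verify(stored_hash: bytes, challenge: bytes, response: bytes) -> bool:
    """Server side: recompute the expected response from the stored hash."""
    expected = challenge_response(stored_hash, challenge)
    return hmac.compare_digest(expected, response)


if __name__ == "__main__":
    stored = password_hash("correct horse battery staple")
    challenge = os.urandom(16)

    # The attacker never learns the password, only the dumped hash.
    stolen_hash = stored
    forged = challenge_response(stolen_hash, challenge)
    print(server_verify(stored, challenge, forged))
```

Because there is no salt, two users with the same password also end up with the same hash, which makes precomputed tables and bulk guessing much more effective.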

We really need better options for enterprise network authentication. It's really unfortunate that Microsoft hasn't offered anything stronger. SRP with a strong password hash like bcrypt would have helped out a lot here. The attacker wouldn't be able to pass the hash and it would be very difficult to crack the hashes/verifiers.
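For contrast, here is a minimal sketch of salted, deliberately slow password hashing. bcrypt isn't in the Python standard library, so this uses PBKDF2-HMAC-SHA256 as a stand-in; the iteration count and function names are assumptions. It illustrates only the hashing half of the suggestion and does not implement SRP.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple


def make_verifier(password: str, salt: Optional[bytes] = None,
                  iterations: int = 200_000) -> Tuple[bytes, bytes]:
    """Derive a salted, slow hash. Returns (salt, digest)."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt per account
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                 salt, iterations)
    return salt, digest


def check_password(password: str, salt: bytes, digest: bytes,
                   iterations: int = 200_000) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, candidate = make_verifier(password, salt, iterations)
    return hmac.compare_digest(candidate, digest)


if __name__ == "__main__":
    salt_a, digest_a = make_verifier("Summer2012!")
    salt_b, digest_b = make_verifier("Summer2012!")
    # Same password, different salts, different stored values.
    print(digest_a != digest_b)
```

Each guess now costs the attacker hundreds of thousands of hash operations instead of one, and the per-account salt means every account has to be attacked separately.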

Snooping Around (Day 21 - Day 30)

The attacker was pretty busy poking at different systems for more than a week. He accessed dozens of servers and performed "reconnaissance" several times. Did he scan and fingerprint the network with nmap? Did he just poke around at the network shares available to him? How noisy was the attacker?

This is probably where a NIDS could have detected the attacker's activity. I've knocked IDS in the past because in many cases it's just too easy to bypass. But here, it may have been the right tool for the job. Network session or statistical data would have been awesome too, assuming someone is actually keeping watch. Perhaps Richard Bejtlich can convince them to start capturing full content.

Did the Department of Revenue have IDS alerts, session data or statistical data? Was anyone looking at it? Are various data sources properly correlated?
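To give a sense of what watching session data could look like, here is a minimal sketch that flags any internal source that talks to an unusually large number of distinct systems. The record format, the threshold and the host names are all hypothetical; the report doesn't describe what data the department actually collected.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def flag_fanout(sessions: Iterable[Tuple[str, str]],
                threshold: int = 10) -> Dict[str, int]:
    """Return {source: distinct destination count} for sources that
    contacted at least `threshold` distinct destinations.

    `sessions` is an iterable of (source, destination) pairs, e.g. pulled
    from flow or session logs.
    """
    seen: Dict[str, Set[str]] = defaultdict(set)
    for source, destination in sessions:
        seen[source].add(destination)
    return {source: len(dests) for source, dests in seen.items()
            if len(dests) >= threshold}


if __name__ == "__main__":
    # Made-up data: one workstation touching 38 systems, one behaving normally.
    sessions = [("workstation-17", f"server-{n}") for n in range(38)]
    sessions += [("workstation-02", "fileserver"), ("workstation-02", "mail")]
    print(flag_fanout(sessions))
```

A threshold like this only helps if someone reviews the output, and a patient attacker can stay under it by spreading activity over more days.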

Paydirt (Day 31 - Day 33)

After 17 days on the network, the attacker made a copy of some database backup files that totaled 74.7 GB of uncompressed data. These backups came from three different systems. According to the report, some of this data was encrypted and some wasn't. Why wasn't it all encrypted? Was some of the information less sensitive?

At this point, it's game over. The attacker has what he wants and is shipping it home.
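Statistical data could, in principle, have caught this step too. Here is a minimal sketch that totals outbound bytes per host per day and flags anything over a limit. The flow record layout, the 10 GB limit and the sample numbers are assumptions for illustration; the report doesn't say how large the transferred files were once packaged.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Flow = Tuple[str, str, int]  # (internal host, day label, bytes sent outbound)


def flag_large_uploads(flows: Iterable[Flow],
                       limit_bytes: int = 10 * 10**9) -> Dict[Tuple[str, str], int]:
    """Return {(host, day): total outbound bytes} for totals above the limit."""
    totals: Dict[Tuple[str, str], int] = defaultdict(int)
    for host, day, bytes_out in flows:
        totals[(host, day)] += bytes_out
    return {key: total for key, total in totals.items() if total > limit_bytes}


if __name__ == "__main__":
    # Made-up flows: a server pushing many large chunks out in one day.
    flows = [("db-backup-host", "Day 32", 2 * 10**9) for _ in range(8)]
    flows.append(("workstation-02", "Day 32", 50 * 10**6))
    print(flag_large_uploads(flows))
```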

Password Expiration?

I've blogged before about password expiration. Many people argue that expiration is useful for limiting an attacker's access. I think this breach is a good example of why it just isn't so. The attacker first logged in on Day 15 and started copying the database backups on Day 31, a span of 17 days counting both endpoints. With a password expiration of 60 days or longer, the attacker would probably not have been locked out of the initial account he compromised before he had accomplished his objective.

With an expiration of 30 days, the odds are better than even that the password would have changed before he finished. But, within 5 days of his initial login, the attacker had stolen hashes from several servers and at least one domain. He had access to many accounts; hundreds, maybe thousands. The attacker also installed a backdoor on at least one system which implies that he had administrator rights. In order for expiration to stop him, he would have had to be dependent on the first compromised account with little or no ability to expand his access to additional accounts or systems.

It's also worth noting that the attacker accessed over three dozen systems in the 15 days after his initial access, which means he had the opportunity to cause a lot of damage before any potential expiration would have kicked in, even with an aggressive 30-day policy.
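The "better than even" and "probably not" estimates above can be checked with some back-of-the-envelope arithmetic. This sketch assumes the compromised password's age at the time of the first login was uniformly distributed over the expiration period, and that the user only changes it when forced to. Both are my simplifying assumptions, and the function name is made up.

```python
def chance_password_expires(window_days: float, expiry_days: float) -> float:
    """Probability that the password expires during the attacker's window,
    assuming its age at first login is uniform over the expiration period."""
    return min(1.0, window_days / expiry_days)


if __name__ == "__main__":
    window = 17  # days from first login to copying the backups, per the post
    for expiry in (30, 60, 90):
        chance = chance_password_expires(window, expiry)
        print(f"{expiry}-day expiration: {chance:.0%} chance of a forced change")
```

Under those assumptions, a 30-day policy gives a bit better than even odds (17/30) and a 60-day policy a bit better than one in four (17/60). Neither number accounts for the other accounts the attacker had already compromised by Day 20.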

On the other hand, two-factor authentication would have made it much more difficult for the attacker to gain access initially and to gain access to additional accounts once he had access. Stronger password hashing and network authentication would also have made it much more difficult to gain access to additional accounts.

Detection

If the Department of Revenue had detected the initial phishing attempt, this whole sequence would have been disrupted (although the attacker could have tried again). If they had detected the attacker at any point during the first 15 days that he had access, the impact could have been greatly reduced. By the time they contacted Mandiant, the attacker was done.

Remediation Summary

I identified several possibilities for what the Department of Revenue could have done differently (and should do in the future).

There are a few recommendations that I'm (fairly) confident about:

Implement two-factor authentication inside the network or for critical systems.

Require remote users to authenticate to a VPN using digital certificates.

Educate users about phishing.

Encourage users to report phishing attempts. Make it easy for them, and follow up on the reports.

Implement a network intrusion detection system, or manage the existing one better.

Capture and record session and statistical data.

Correlate logs, IDS alerts and network data.

There are also a few recommendations that may or may not be applicable:

I'd like to express my thanks to Mandiant and/or the South Carolina Department of Revenue for publishing this report. I think there's a lot to be learned from these reports, especially for those of us who are not actively working in incident response.

Final Note

I may update this post in the next few days if I think of anything else. I'll add a note here at the bottom to identify any major changes.