With 9TB of data, survey is one of the most exhaustive—and illicit—ever done.

In one of the more audacious and ethically questionable research projects in recent memory, an anonymous hacker built a botnet of more than 420,000 Internet-connected devices and used it to perform one of the most comprehensive surveys ever to measure the insecurity of the global network.

In all, the nine-month scanning project found 420 million IPv4 addresses that responded to probes and 36 million more addresses that had one or more ports open. A large percentage of the unsecured devices bore the hallmarks of broadband modems, network routers, and other devices with embedded operating systems that typically aren't intended to be exposed to the outside world. The researcher found a total of 1.3 billion addresses in use, including 141 million that were behind a firewall and 729 million that returned reverse domain name system records. There were no signs of life from the remaining 2.3 billion IPv4 addresses.
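The breakdown above can be checked with a quick tally. The sketch below uses the article's rounded figures; the group names are illustrative labels rather than the report's own terms, and it assumes the four groups do not overlap. The article does not say why the total falls short of the full 32-bit address space; reserved and private ranges are one plausible explanation.

```python
# Figures are the article's rounded numbers, in millions of IPv4 addresses.
# Group names are illustrative labels (an assumption), and the four groups
# are assumed to be disjoint so that they can simply be added together.
IN_USE_MILLIONS = {
    "responded_to_probes": 420,
    "open_ports": 36,
    "behind_firewall": 141,
    "reverse_dns_records": 729,
}
NO_RESPONSE_MILLIONS = 2_300
IPV4_SPACE_MILLIONS = 2**32 / 1_000_000  # every possible 32-bit address


def total_in_use(groups):
    """Sum the per-group counts (in millions of addresses)."""
    return sum(groups.values())


in_use = total_in_use(IN_USE_MILLIONS)
accounted_for = in_use + NO_RESPONSE_MILLIONS
not_covered = IPV4_SPACE_MILLIONS - accounted_for

print(f"in use:        {in_use} million")
print(f"accounted for: {accounted_for} million")
print(f"not covered:   {not_covered:.0f} million")
```

The four groups sum to roughly the 1.3 billion addresses the researcher reported as in use, and adding the silent addresses gives the 3.6 billion figure quoted for a full scan.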

Continually scanning almost 4 billion addresses for nine months is a big job. In true guerilla research fashion, the unknown hacker developed a small scanning program that scoured the Internet for devices that could be logged into using no account credentials at all or the usernames and passwords of either "root" or "admin." When the program encountered unsecured devices, it installed itself on them and used them to conduct additional scans. The viral growth of the botnet allowed it to infect about 100,000 devices within a day of the program's release. The critical mass allowed the hacker to scan the Internet quickly and cheaply. With about 4,000 clients, it could scan one port on all 3.6 billion addresses in a single day. Because the project ran 1,000 unique probes on 742 separate ports, and possibly because the binary was uninstalled each time an infected device was restarted, the hacker commandeered a total of 420,000 devices to perform the survey.
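The scan-rate claim is easy to sanity-check. The sketch below is back-of-the-envelope arithmetic on the article's figures, not a calculation taken from the report. It assumes the work is split evenly across clients and, for the last figure, that each of the 742 ports gets exactly one full pass, which the article does not state.

```python
SECONDS_PER_DAY = 24 * 60 * 60

ADDRESSES = 3_600_000_000  # addresses covered by one full pass (from the article)
CLIENTS = 4_000            # clients needed to scan one port in a day (from the article)
PORTS = 742                # separate ports probed (from the article)

# Assumption: the addresses are divided evenly among the clients.
per_client_per_day = ADDRESSES // CLIENTS
per_client_per_second = per_client_per_day / SECONDS_PER_DAY

# Assumption: one full pass per port. One pass costs CLIENTS client-days.
client_days_all_ports = CLIENTS * PORTS

print(f"{per_client_per_day:,} addresses per client per day")
print(f"{per_client_per_second:.1f} addresses per client per second")
print(f"{client_days_all_ports:,} client-days to cover every port once")
```

Each client would need to handle about ten addresses a second, a modest load for a single device. Covering all 742 ports at that pace takes millions of client-days, which is consistent with the article's point that a far larger pool of devices was needed.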

More than nine terabytes of data

"A lot of devices and services we have seen during our research should never be connected to the public Internet at all," the guerilla researcher concluded in a 5,000-word report titled Internet Census 2012: Port scanning /0 using insecure embedded devices. "As a rule of thumb, if you believe that 'nobody would connect to the Internet, really nobody,' there are at least 1,000 people who did. Whenever you think 'that shouldn't be on the Internet but will probably be found a few times' it's there a few hundred thousand times. Like half a million printers, or a million Webcams, or devices that have root as a root password."

In all, the botnet, which the researcher named "Carna" after the Roman goddess of physical health, collected more than 9TB worth of data. It logged 52 billion ICMP ping probes, 180 billion service probe records, and 2.8 billion SYN scan records covering 660 million IPs and 71 billion tested ports. The researcher said he took precautions to prevent his program from disrupting the normal operation of the infected devices.

"Our binaries were running with the lowest possible priority and included a watchdog that would stop the executable in case anything went wrong," he wrote. "Our scanner was limited to 128 simultaneous connections and had a connection timeout of 12 seconds."

He continued: "We used the devices as a tool to work at the Internet scale. We did this in the least invasive way possible and with the maximum respect to the privacy of the regular device users."
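The 128-connection cap and 12-second timeout quoted above put a floor under how fast a single device could work. The sketch below is an illustration under a stated assumption, not something the report spells out: in the worst case every connection attempt runs to the full timeout, so throughput is the connection cap divided by the timeout.

```python
SECONDS_PER_DAY = 24 * 60 * 60

MAX_CONNECTIONS = 128   # simultaneous connections (from the researcher's quote)
TIMEOUT_SECONDS = 12    # connection timeout (from the researcher's quote)


def worst_case_attempts(max_connections, timeout_seconds, window_seconds):
    """Attempts completed in the window if every connection hits the timeout."""
    return max_connections * window_seconds / timeout_seconds


per_second = worst_case_attempts(MAX_CONNECTIONS, TIMEOUT_SECONDS, 1)
per_day = worst_case_attempts(MAX_CONNECTIONS, TIMEOUT_SECONDS, SECONDS_PER_DAY)

print(f"{per_second:.1f} attempts per second in the worst case")
print(f"{per_day:,.0f} attempts per day in the worst case")
```

Even in this worst case a device gets through slightly more than the 900,000 addresses per day implied by splitting 3.6 billion addresses across 4,000 clients. Connections that answer or are refused before the timeout would only raise the rate.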

The researcher found that his scanning program wasn't the only unauthorized code hitching a free ride on some of the commandeered devices. Competing botnet programs such as one known as Aidra infected as many as 30,000 embedded devices, including the Linux-powered Dreambox TV receiver and other devices that run on MIPS hardware. The scanning software detected capabilities in Aidra that forced compromised devices to carry out a variety of denial-of-service attacks on targets selected by the malicious botnet operators.

"Apparently its author only built it for a few platforms, so a majority of our target devices could not be infected with Aidra," the researcher reported. "Since Aidra was clearly made for malicious actions and we could actually see their Internet scale deployment at that moment, we decided to let our bot stop telnet after deployment and applied the same iptable rules Aidra does, if iptables was available. This step was required to block Aidra from exploiting these machines for malicious activity."

The changes didn't survive reboots, however, allowing Aidra to resume control of the embedded devices once they were restarted. The scanning program was designed to install itself on uninfected devices, so it may have repeatedly disrupted the malicious bot software only to be foiled each time a device was rebooted.

Carna Botnet's 420,000-client distribution, March to December 2012.

Breaking the law

The research project almost certainly violated federal statutes prohibiting the unauthorized access of protected computers and possibly other hacking offenses. And since the unknown researcher is willing to take ethical and legal liberties in his work, it's impossible to verify that he carried out the project in the manner described in the paper. Still, the findings closely resemble those of HD Moore, the CSO of security firm Rapid7 and chief architect of the Metasploit software framework used by hackers and penetration testers. Over a 12-month period last year, he used ethical and legal means to probe up to 18 ports of every IPv4 Internet address three or four times each day. The conclusion: there are about 1.3 billion addresses that respond to various scans, with about 500 million to 600 million of them coming from embedded devices that were never intended to be reachable on the Internet.

Over three months in mid-2012, the researcher sent an astounding 4 trillion service probes, 175 billion of which drew responses that were saved. In mid-December the researcher probed the top 30 ports, providing about 5 billion additional saved service probes. The report includes a detailed list of the probes sent to specific ports.

"This looks pretty accurate," Moore said of the guerilla report, which included a wealth of raw data to document the findings. "Embedded devices really are one of the most common devices on the Internet, and the security of these devices is terrible. I ran into a number of active botnets using those devices to propagate."

The only way to ultimately confirm the veracity of the findings is to go through the data in precise detail, which is something fellow researchers have yet to do publicly.

Moore said there were advantages and disadvantages to each of the studies. While use of an illicit botnet may have provided greater visibility into the overall Internet population, it amounted to a much briefer snapshot in time. Moore's approach, by contrast, was more limited since it probed just 18 ports. But because it surveyed devices every day for a year, its results are less likely to reflect anomalies resulting from seasonal differences in Internet usage.

Putting aside the ethical and legal concerns of taking unauthorized control of hundreds of thousands of devices, the researcher builds a compelling case for taking on the project.

"We would also like to mention that building and running a gigantic botnet and then watching it as it scans nothing less than the whole Internet at rates of billions of IPs per hour over and over again is really as much fun as it sounds like," he wrote. What's more, with the advent of IPv6, the opportunity may never come again, since the next-generation routing system offers orders of magnitude more addresses that are impossible to be scanned en masse.

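The point about IPv6 can be made concrete with some rough arithmetic. The sketch below assumes a scanner that sustains one full IPv4-sized pass per day (3.6 billion addresses, the article's figure for about 4,000 clients) and asks how long the same rate would take on IPv6. The rate is an assumption chosen for illustration; 2^64 is the size of a single standard /64 IPv6 subnet and 2^128 is the whole address space.

```python
DAYS_PER_YEAR = 365.25

# Assumption: the scanner sustains one IPv4-sized pass (3.6 billion
# addresses, the article's figure) every day.
ADDRESSES_PER_DAY = 3_600_000_000


def years_to_scan(address_count, per_day=ADDRESSES_PER_DAY):
    """Years needed to probe every address once at a constant daily rate."""
    return address_count / per_day / DAYS_PER_YEAR


single_subnet_years = years_to_scan(2**64)   # one standard /64 subnet
whole_space_years = years_to_scan(2**128)    # the entire IPv6 space

print(f"one /64 subnet:   about {single_subnet_years:.2e} years")
print(f"whole IPv6 space: about {whole_space_years:.2e} years")
```

At that rate a single /64 subnet would take on the order of 14 million years to sweep, so exhaustive scanning of the kind described here does not carry over to IPv6.
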
The researcher concluded by explaining the ultimate reason he took on the project.

"I did not want to ask myself for the rest of my life how much fun it could have been or if the infrastructure I imagined in my head would have worked as expected," he explained. "I saw the chance to really work on an Internet scale, command hundred thousands of devices with a click of my mouse, portscan and map the whole Internet in a way nobody had done before, basically have fun with computers and the Internet in a way very few people ever will. I decided it would be worth my time."

Promoted Comments

I see significant value in such research, if it could be authorized and conducted at scale. What this guy (and Moore) did reveals a great deal about the general health and security of the public Internet, at all levels. ISPs and other network owners and operators, private or otherwise, might even pay for the information - but probably shouldn't have to.

Think about the government-sponsored "Sam Knows" broadband study as an example, and then add the benefits of detecting major security issues, including the installation of actual malware, such as Aidra. It goes way beyond "what ports are open that probably shouldn't be?" Major benefits there! Ownership and/or governance are quite beyond me, but I like the idea.

There is the start of an authorized project of that nature at http://www.netdimes.org/new/ a distributed computing project that is working on mapping the internet and checks at least the ping response of all IPs. With additional help and funding I'm sure it could be expanded to include additional checks and information.

Arstechnica has a team that is contributing to the project. viewtopic.php?f=18&t=35620 if you want to join in and need any help, just ask us and we'll be nice, I promise!