Raspberry Pi bot tracks hacker posts to vacuum up passwords and more

Dumpmon scours Twitter for sensitive data hiding in plain sight.

Password and credit-card details leak online every day, so often that no one really knows just how much personally identifiable information is available by clicking on the right link to Pastebin, Pastie, or similar sites. Using a bot that runs on the hobbyist Raspberry Pi platform to drink from this fire hose, a security researcher has cataloged more than 3,000 such posts in less than three months while adding scores more each week.

Dumpmon, as the project is called, is a bot that monitors Twitter messages for Web links containing account credentials, sensitive account information, and other "interesting" content. Since its debut on April 3, it has captured more than 3,300 records containing 1.1 million addresses, most of which are accompanied by the plaintext or cryptographic hash of an associated password. The project has also unearthed social security and driver license numbers, credit card data, and other information that could be used to hijack user accounts or commit identity theft. On average, Dumpmon collects 51 such posts each day.

"It was mainly trying to determine how much information is being hidden from plain view and finding out how much information can be found just by looking in the right place," said Jordan Wright, a security engineer for CoNetrix. (Wright created Dumpmon as an independent side project.) "It's pretty incredible. I wasn't expecting as much information as I found. I was expecting a lot less for sure."

The "dumps," as the online data postings are called, are frequently published to embarrass the victims or as a means for hacking crews to demonstrate their prowess to rivals. Often, dumps are advertised on Twitter or another social networking site with a line or two of vague or cryptic text and a link. In the span it takes to comb through one such posting, a half-dozen or more additional dumps may be posted. The frequency makes it hard for outsiders to keep tabs.

Of the 620 records Wright has analyzed in depth, the researcher recovered 174,423 e-mail addresses accompanied by a hashed or plaintext password. With so many sites using e-mail addresses as account user IDs, the data often gives attackers all they need to access multiple accounts maintained by a victim. In the event the owner has used the same address and password to secure other accounts (or even the e-mail account itself), attackers can reuse the credentials to hijack those as well. Of those 174,423 e-mail addresses, more than 120,000 were accompanied by a plaintext password. The remaining passwords were expressed as cryptographic hashes, which are frequently trivial to crack.
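To make that tally concrete, here is a minimal sketch of how a paste might be scanned for e-mail/password pairs and how each secret might be bucketed as plaintext or a likely hash. The regular expressions, the `classify_credentials` name, and the hex-length heuristic are assumptions made for illustration; they are not taken from Dumpmon's code.

```python
import re
from collections import Counter

# Assumed format: an e-mail address, a ':' separator, then a non-space secret.
CRED_RE = re.compile(r"([A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}):(\S+)")

# Heuristic only: hex strings with the lengths of MD5, SHA-1, or SHA-256 digests.
HASH_RE = re.compile(r"^(?:[0-9a-fA-F]{32}|[0-9a-fA-F]{40}|[0-9a-fA-F]{64})$")


def classify_credentials(text):
    """Count e-mail:secret pairs, split into 'hash' and 'plaintext'."""
    counts = Counter()
    for _email, secret in CRED_RE.findall(text):
        kind = "hash" if HASH_RE.match(secret) else "plaintext"
        counts[kind] += 1
    return counts


# Made-up sample data, not drawn from any real dump.
sample = (
    "alice@example.com:hunter2\n"
    "bob@example.org:5f4dcc3b5aa765d61d8327deb882cf99\n"
    "not a credential line\n"
)
counts = classify_credentials(sample)
print(counts["plaintext"], counts["hash"])
```

A length check like this will misfile a plaintext password that happens to be 32 hex characters, so a real tool would likely need more context than this sketch uses.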

Account credentials are by no means the only valuable data included in these postings. The 620 records Wright analyzed for this article also contained what appeared to be valid data for 1,496 payment cards. In many cases, data collected by Dumpmon included bank account numbers and home addresses. Other files observed by Ars included social security and driver license numbers, first and last names, addresses, and medical diagnoses contained on health records. Dumps also contained passwords stored on computers that had been infected by malware.

"These full identity dumps are probably more of the higher commodity item," Wright said of the records containing social security numbers, names, and addresses. "As far as why these were dumped for free, that's the answer I'm looking for: Why people are giving this information out?"

Some of the data—for instance, a recent dump posted to Pastebin that Ars will not link to—appears to be derived from browsers that were configured to store frequently used account IDs and passwords. When the computers are infected with malware, the credentials are dumped to a file that later gets posted online. The discoveries led Wright to publish a post documenting how Google Chrome, Internet Explorer, and other browsers store passwords. Incidentally, Wright concluded users shouldn't trust their passwords to these storage systems, but I'm not so sure. Any computer that is infected with malware that provides a backdoor onto the system is already vulnerable to wholesale password theft. In fairness to Wright, the sensitive details may be easier or quicker to gather en masse when they're stored in a browser.

Other dumps cataloged by Dumpmon included private SSH encryption keys used to administer websites, configuration files for Cisco routers, and logs from successful malware infections.

To keep things interesting, Dumpmon has been designed to run on the Raspberry Pi platform.

"The goal was to find a happy balance between both obtaining new pastes from the different sites, as well as processing the existing pastes in the queue to determine if they are interesting," Wright said. "This created challenges, since the Raspberry Pi has limited hardware capability and I was monitoring for quite a few things."
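The balance Wright describes can be sketched as a loop that first enqueues any unseen pastes and then processes only a bounded batch from the queue, so that neither task starves the other on limited hardware. Everything named here (`run_cycle`, `fetch_new`, `is_interesting`, `batch_size`) is hypothetical, and Dumpmon's real scheduling may differ.

```python
from collections import deque


def run_cycle(fetch_new, is_interesting, queue, seen, batch_size=5):
    """One cycle: enqueue unseen pastes, then process at most batch_size of them."""
    for paste_id, body in fetch_new():
        if paste_id not in seen:
            seen.add(paste_id)
            queue.append((paste_id, body))

    hits = []
    for _ in range(min(batch_size, len(queue))):
        paste_id, body = queue.popleft()
        if is_interesting(body):
            hits.append(paste_id)
    return hits


def fake_feed():
    # Stand-in for a paste site's "recent pastes" listing.
    return [
        ("a1", "nothing here"),
        ("b2", "user@example.com:secret"),
        ("c3", "hello"),
    ]


queue, seen = deque(), set()
first = run_cycle(fake_feed, lambda body: "@" in body, queue, seen, batch_size=2)
second = run_cycle(fake_feed, lambda body: "@" in body, queue, seen, batch_size=2)
print(first, second)
```

Capping the batch keeps each cycle short, which matters on a slow board: the bot returns to polling for new pastes before they scroll off the listing.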

Because posts on Pastebin and other sites are often taken down by the original poster or site administrators, Dumpmon also copies and stores the contents of each one. While Wright has published the underlying code for anyone to use, he said he makes the cached data available only to white hat researchers.

"I don't want to make it easier for the wrong people," he explained. "My goal was as best as I could only give it to people who will use it responsibly."

I read the article because of my interest in RaspPi, and it took me a while to think of a reason why the researcher used a Pi for this experiment (other than pure hackjoy and cheap hardware). If you are going to leave a computer running for 3 months, something that only pulls 3 watts seems like a good start.

"These full identity dumps are probably more of the higher commodity item," Wright said of the records containing social security numbers, names, and addresses. "As far as why these were dumped for free, that's the answer I'm looking for: Why people are giving this information out?"

In many cases:

A dump is simply revenge for a real or perceived slight of some sort.

Personal info being dumped to Pastebin, etc. and linked to by Twitter is a daily occurrence.

Hack Forums/XBox script kiddies delight in dumping the personal information of some kid, and often their family, who either beat them at Call of Duty, or used a booter service to DDoS them off of their XBox Live account, or owned them in a hack forum by making them look dumb in front of their peers.

These kids aren't looking to sell the information, they just dump it in hopes of wreaking havoc on their targets. Is it illegal? Of course it is, but a sociopathic fifteen-year-old gamer with no parental supervision rarely "thinks" before they hit the "enter" key to drop their adversary's information.

These kinds of "dumps" are the "middle finger" of the 21st Century script kiddie.

Could they sell this info on "carder" and personal ID sites? Yes, but the truth is, they stand to make, maybe, $5 for a complete dump with an SSN, and as dumb as they are, even a skid realizes that the risk vs. reward simply isn't worth it.

It seems like 90% of the Pi projects use it for the 'pure hackjoy'.

I can sympathize, as I've already fallen into the trap of buying a shiny little modular Linux board & not being creative enough to think of a good project for it. So instead I just used it for a regular project "to keep things interesting."

Looking at the author's blog post and the code, there isn't any mention of running it on a Raspberry Pi. In other words, it doesn't seem like the author of Dumpmon thought it was that interesting or special.

Sure it's cute to think of a dedicated 3-5 watt single purpose "robot" out there "vacuuming" up the internet but any computer connected to the internet can have this running in the background.

One of my students was doing this a couple years ago. Pastebin and other sites list the last 10 pastes to them. By scraping the main page, you can get all the page addresses. It's amazing what people will put on Pastebin.
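For anyone curious, the scraping step described in that comment might look roughly like the sketch below, which pulls paste links out of a listing page using only the Python standard library. The sample HTML and the assumption that paste IDs are eight alphanumeric characters are made up for illustration; a real site's markup will differ, and its terms of use may restrict scraping.

```python
import re
from html.parser import HTMLParser

# Assumption: a paste URL path is '/' followed by exactly 8 alphanumerics.
PASTE_PATH = re.compile(r"/[A-Za-z0-9]{8}")


class PasteLinkParser(HTMLParser):
    """Collect unique hrefs that look like paste links, in page order."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if PASTE_PATH.fullmatch(href) and href not in self.links:
            self.links.append(href)


# Hypothetical listing markup standing in for a site's "recent pastes" box.
SAMPLE = (
    '<ul>'
    '<li><a href="/Ab3dE9xZ">Untitled</a></li>'
    '<li><a href="/archive">Archive</a></li>'
    '<li><a href="/Qw8rT2yU">dump</a></li>'
    '</ul>'
)

parser = PasteLinkParser()
parser.feed(SAMPLE)
links = parser.links
print(links)
```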

The NSA is different. They go through massive amounts of information that is supposed to be private and do it via less ethical means, such as tapping into phone calls and monitoring internet connections.

This guy was simply using a program to grab a bit of publicly available data that anyone could have accessed had they clicked the right links. Unlike the NSA, his goal wasn't to use the information he found, but rather to make statistics on the quantity of data he found.