I've been thinking about fun software projects to do in my spare time, and the spam flooding my inbox gave me an idea: I want to try writing my own spam classifier. It probably won't be as good as the ones you can find online, but it seems like a good way to learn about Bayesian networks.
To try out a classifier, I need a large corpus of spam emails. Does anyone know of a site where I can sign myself up for a boatload of spam? I want to set up about 20 honeypot Gmail accounts that do nothing but collect the latest spam, then use one of those libraries that let you use Gmail as a disk to fetch the messages and mount a real-time defense. I probably won't get that far, but I like the idea.
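To make the idea concrete, here is a rough sketch of the simplest Bayesian approach I can think of, a naive Bayes word-count classifier. Everything in it is an assumption for illustration: the class name `NaiveBayesSpamFilter`, the `tokenize` helper, the "spam"/"ham" labels, and the add-one smoothing are my own choices, not anything taken from an existing filter.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word-like tokens."""
    return re.findall(r"[a-z0-9$']+", text.lower())


class NaiveBayesSpamFilter:
    """Hypothetical minimal naive Bayes classifier over word counts."""

    LABELS = ("spam", "ham")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.message_counts = {label: 0 for label in self.LABELS}

    def train(self, text, label):
        """Record the words of one message under the given label."""
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Return the estimated P(spam | text) as a float in [0, 1]."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        # Words never seen in training carry no evidence, so skip them.
        words = [w for w in tokenize(text) if w in vocab]

        log_scores = {}
        for label in self.LABELS:
            # Smoothed log prior, so an untrained label never hits log(0).
            score = math.log(
                (self.message_counts[label] + 1) / (total_messages + 2)
            )
            total_words = sum(self.word_counts[label].values())
            for word in words:
                # Add-one (Laplace) smoothing for the word likelihood.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(vocab)))
            log_scores[label] = score

        # Normalize in log space to avoid underflow on long messages.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])
```

The plan would be to call `train(text, "spam")` on the honeypot mail and `train(text, "ham")` on my real mail, then flag anything where `spam_probability` comes out above some threshold I'd have to tune.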

One way to increase the volume, if you can, is to set up a catch-all inbox that collects every email delivered to a particular domain. I receive a fair amount of spam (100 emails/day or so), and the majority of it isn't addressed to example@example.com but to other addresses at the domain, say john@example.com.

[Edited by - benryves on March 5, 2010 9:46:33 AM]

If you post the email addresses on a public forum like this, you'll start getting spam quite soon. There are email harvesters that check sites like this. Posting the addresses on Usenet will REALLY do the trick. Once you have some spam, click on all of it. The next day you'll have ten times as much.

If you can tell me how to export selected emails from Thunderbird, I have 1000+ in my spam folder right now (I get 50-100 spam messages a day, and it deletes them regularly enough that my junk folder stays at around 1000). If you own a domain, set up the postmaster@domain address and it'll get tons of spam instantly. Likewise for webmaster@, and less so for abuse@.
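On the export question: as far as I know, Thunderbird by default keeps each folder as a plain mbox file inside the profile directory, so it may be enough to copy the junk folder's file rather than export anything. Assuming that holds for your setup, something like the sketch below could read it using Python's standard `mailbox` module. The function name `load_mbox_texts` is made up for illustration, and decoding every body as UTF-8 is a simplification that ignores each message's declared charset.

```python
import mailbox


def load_mbox_texts(path):
    """Return a list of (subject, body) string pairs from an mbox file."""
    messages = []
    # create=False: fail loudly instead of creating an empty mailbox.
    box = mailbox.mbox(path, create=False)
    try:
        for msg in box:
            if msg.is_multipart():
                # Keep only the plain-text parts of multipart messages.
                parts = [
                    p for p in msg.walk()
                    if p.get_content_type() == "text/plain"
                ]
            else:
                parts = [msg]

            body = ""
            for part in parts:
                # decode=True undoes base64 / quoted-printable encoding.
                payload = part.get_payload(decode=True)
                if payload:
                    # Simplification: assume UTF-8, replace bad bytes.
                    body += payload.decode("utf-8", errors="replace")

            subject = str(msg["subject"] or "")
            messages.append((subject, body))
    finally:
        box.close()
    return messages
```

Each body it returns could then be fed to whatever training routine the classifier ends up with.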