Friday, January 06, 2006

There's been a lot of speculation that the "Bush spy plot" has to do with data mining -- sieving large amounts of data (such as all the email sent through a particular server) for specific information (such as the use of the word "hijack" or "bomb") that might be of interest to counter-terrorism operators. My impression is that while there may be data mining a la that supposedly discontinued Total Information Awareness program, there's probably more to it than that. But, nonetheless ... How hard is data mining, really?

It's not that hard at all, especially if you're the government and can waltz into any ISP, flash a badge, tap into all the incoming and outgoing traffic, and run it through the largest and most powerful collection of computers in the known cosmos.

What you may not know is that it can be easy for us apes, too ... because a lot of sites have features which -- with your permission and even active connivance -- make the data available on request. You just have to know how to request it in large quantities and how to analyze it once you've got it.

Hmm ... there are 414 Osamas with Amazon.com wishlists. The article explains how to find out that much simply by plugging a first name into a URL. After that, it's a matter of some operations even I can grasp (but don't feel like investing the time in actually doing) to figure out where they live (state/city/ZIP, anyway) and whether any of them have expressed a burning desire to own piloting manuals, explosives guides, etc.
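For the curious, here's roughly what that last step might look like once the wishlists are already in hand. This is only a sketch, not the article's method: the record layout (state, city, titles), the watch words, and the sample entries are all made up for illustration, and it doesn't touch Amazon at all.

```python
from collections import defaultdict

# Hypothetical watch words, chosen only for illustration.
WATCH_WORDS = {"piloting", "explosives"}


def flag_wishlists(wishlists, watch_words=WATCH_WORDS):
    """Group wishlist titles containing a watch word by (state, city)."""
    flagged = defaultdict(list)
    for wishlist in wishlists:
        for title in wishlist["titles"]:
            # Case-insensitive match on whole words in the title.
            words = set(title.lower().split())
            if words & watch_words:
                flagged[(wishlist["state"], wishlist["city"])].append(title)
    return dict(flagged)


# Made-up sample records (assumed layout); not real data.
sample = [
    {"state": "OH", "city": "Dayton", "titles": ["Piloting Basics", "A Cookbook"]},
    {"state": "TX", "city": "Austin", "titles": ["Gardening for Beginners"]},
]

for place, titles in flag_wishlists(sample).items():
    print(place, titles)
```

The sieve itself is a set intersection inside two loops, which is more or less the point: once the data is sitting in front of you, the analysis is the easy part.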

Not difficult (it just took an expert to figure it out and tell us how to do it). If you and I can do it, then it goes without saying that the government can do it. The government can do anything you or I can, and on a much larger scale, except maybe balance their goddamn checkbook.
