Apr 11, 2011

Predator Detection Bots

Crisp Thinking announced today that its community management platform will be used by Jabble, a new children's social networking site. Says Crisp Thinking:

"The technology has an independently tested accuracy rating of 98.4 per cent by Cambridge University in the UK in the detection of online predators. It uses a combination of machine learning heuristics to detect long-term behaviours, concept analysis, filtering technology and reputation analysis to help keep children safe online. The Platform operates in multi-languages and ensures communities, social networks and online games remains COPPA compliant."

Comments

A note of caution: What do they mean by "accuracy rating of 98.4" and how did they measure that?

Also remember what that accuracy rating means. A 98.4% accuracy equates to a 1.6% error rate. Say I have a population that consists of 100,000 regular users and 1,000 predators, so approximately 1% of users are predators. If the 98.4% figure applies equally to both groups, we'll correctly flag about 984 of the 1,000 predators.

However, we'll also mis-identify 1,600 regular users as predators. More regular users are falsely labeled than predators caught. The numbers get worse as the proportion shifts. If the non-predator population is 1,000,000, then the false-positive number rises to 16,000, totally swamping the real positives.
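
To make that arithmetic concrete, here is a minimal sketch in Python. It assumes the quoted 98.4% applies equally to predators and regular users, which the press release does not actually say, and the function name `screening_outcomes` is made up for illustration.

```python
def screening_outcomes(regular_users, predators, accuracy):
    """Estimate flagged counts for a population screened by a classifier.

    Assumption: the same accuracy applies to both groups, i.e. the
    detection rate for predators and the correct-clearance rate for
    regular users are both equal to `accuracy`.
    """
    # Predators correctly flagged.
    true_positives = round(predators * accuracy)
    # Regular users wrongly flagged as predators.
    false_positives = round(regular_users * (1 - accuracy))
    # Share of flagged accounts that really are predators.
    precision = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, precision


if __name__ == "__main__":
    for regular in (100_000, 1_000_000):
        tp, fp, precision = screening_outcomes(regular, 1_000, 0.984)
        print(f"{regular:>9,} regular users: {tp} predators caught, "
              f"{fp:,} false alarms, {precision:.1%} of flags are real")
```

Under that assumption, fewer than four in ten flagged accounts belong to predators in the first scenario, and fewer than one in ten in the second.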

>"The technology has an independently tested accuracy rating of 98.4 per cent by Cambridge University in the UK in the detection of online predators."

So where did "Cambridge University" get these online predators from, so they could test whether or not the software picked up their grooming attempts? I had a look at the Crisp Thinking site but couldn't find any links to the research, just the figure of 98.4%. For all I know, some second-year undergraduate could have typed 62 lines that they thought a groomer might use, and one of them didn't get caught by the software. On the other hand, there could have been a major survey using police transcripts of actual grooming sessions, and they caught the activity 98.4% of the time with no false positives.

I realise that on a web site they're not necessarily going to publish the research, but if I were going to use their system I'd want to see it.

I agree with Richard: 98.4% is an awfully specific number for a rather unspecific test. The testing methodology is not known either, nor is it clear why the system caught certain "people". Security systems should derive their effectiveness from quality, not secrecy.

That's an incredibly high level of accuracy, but how many of the people who are gauged as being predatory in their online activities actually are? When it comes to the safety of our children, any possible help we can get is worth the few who will inevitably be wrongly labelled.