Start-up seeks to spin a safer web

SiteAdvisor to police pages

File-sharing software that installs adware, websites that attempt to compromise a visitor's computer, and free downloads that install a host of other unwanted software - the web has become a confusing and sometimes dangerous place for the average home user.

A group of graduates from the Massachusetts Institute of Technology (MIT) aim to change that by crawling the web with hundreds, and soon thousands, of virtual computers that detect which websites attempt to download software to a visitor's computer and whether giving out an e-mail address during registration can lead to an avalanche of spam.

The goal is to create a service that lets the average internet user know what a website actually does with any information collected or what a download will do to a computer, Tom Pinckney, vice president of engineering and co-founder of the start-up SiteAdvisor, said during a presentation at the CodeCon conference here.

"We put a robot 'you' in first, and then you can see if it comes out with a bunch of arrows stuck in it," Pinckney said. "By using virtual yous, we end up measuring behaviors in a more holistic approach - creating a list of sites that are good and a list of sites that have dubious behaviour."

The service builds on research into data collected through legions of computers that automatically browse the internet, downloading free programs and filling out forms. Microsoft has used such client-side honeypots, which the software giant's researchers call honeymonkeys, to browse the riskier side of the web for servers that use zero-day exploits against visitors. The University of Washington used similar techniques to survey the web and found that one in 20 executables on the internet contains spyware.

SiteAdvisor wanted to take the model further and so created virtual computers that act as if they were the most permissive users - downloading any offered program and entering uniquely identifiable email addresses into any form. The result is that the service can keep track of the consequences of visiting a site and giving that site information. So far, the company's browsing automatons - virtual Windows machines running on Linux computers - have browsed more than 1.6m websites.
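The idea of a uniquely identifiable address per form can be illustrated with a minimal sketch. This is not SiteAdvisor's code: the mail domain, the secret key, and the function names `address_for` and `attribute_spam` are all hypothetical, chosen only to show how mail arriving at a tagged address could be traced back to the site that was given it.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical values for illustration only; nothing here comes from SiteAdvisor.
SECRET = b"demo-secret"
MAIL_DOMAIN = "probe.example"


def address_for(site: str) -> str:
    """Derive one stable, hard-to-guess address for each site visited."""
    digest = hmac.new(SECRET, site.lower().encode("utf-8"), hashlib.sha256)
    return f"u{digest.hexdigest()[:12]}@{MAIL_DOMAIN}"


def attribute_spam(registered_sites, spam_recipients):
    """Count unsolicited messages per site, keyed on the recipient address."""
    lookup = {address_for(site): site for site in registered_sites}
    counts = Counter()
    for recipient in spam_recipients:
        site = lookup.get(recipient.lower())
        if site is not None:  # ignore mail sent to addresses we never issued
            counts[site] += 1
    return counts
```

Because each address is handed to exactly one site, any message that later arrives at it points back to that site, either as the sender or as the party that passed the address on.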

The company intends to make money by offering a free service to consumers and charging for a premium version. Internet users can use the service like a Zagat's or Consumer Reports for the web, Pinckney said, learning what will happen if they visit a site, install a program, give away an email address or agree to a massive end-user license agreement (EULA).

"If your car had an EULA 20 pages long that allowed the manufacturer to wallpaper your house in Exxon ads, would you go for that?" he asked. "That is what these sites are trying to do to our computers."

SiteAdvisor offers extensions for both Internet Explorer and Mozilla Firefox. The extensions make the information easily visible in Google and add a menu to the address bar, where users can get more information about a particular site.

The technology to automate browsing the web and downloading tools does not always work, Pinckney said. In some cases, a human has to decide what is a good application and what is malicious code.

"We looked for the cheapest artificial intelligence to survey the questionable cases, and (the solution) ended up being hiring a bunch of guys in India," Pinckney said.

In the end, the research has also allowed the company to define the connections between malicious parts of the web, what Pinckney called the "malweb". Under the entry for a particular website, SiteAdvisor displays those links, colour-coding them to indicate the severity of the satellite sites' dubious practices: red for sites with questionable practices and green for legitimate sites.
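The colour-coding of linked sites can be sketched roughly as follows. The data structures and the function name `colour_links` are assumptions made for illustration, and the "unrated" label for sites in neither list is also an assumption; the article mentions only red and green.

```python
# Hypothetical labels; only "red" and "green" are mentioned in the article.
RED, GREEN, UNRATED = "red", "green", "unrated"


def colour_links(site, links, questionable, legitimate):
    """Map each site linked from `site` to a colour based on its rating."""
    coloured = {}
    for target in links.get(site, []):
        if target in questionable:
            coloured[target] = RED
        elif target in legitimate:
            coloured[target] = GREEN
        else:
            coloured[target] = UNRATED  # assumption: not yet assessed
    return coloured
```

Here `links` is assumed to be a dictionary from a site to the list of sites it links to, and `questionable` and `legitimate` are sets of already-rated sites.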

While many groups will try to game the system and detect SiteAdvisor's web-crawling automatons, good sites should be able to profit from getting a good rating. A service that lets the user place more trust in certain sites could increase traffic to legitimate sites, Pinckney said.

"Because there are some bad screensaver sites out there, no one is downloading free screensavers anymore," he said. "This could help the good sites."

The company is currently previewing its service and plans to roll out the full version soon.