Sunday, November 2, 2008

Webmasters often ask why Google can't give more information about malware on websites. A couple of factors come into play.

Google's automated scanners see only the end result of the infection. They have no way of knowing how or exactly when the malicious content was added to a website. There are many ways it could have happened, from SQL injection to stolen passwords. The automated scanners just know the bad content is there now.

To some extent, the success of Google's automated scanners relies on the secrecy of the methods used. The team has published malware papers (like Ghost in the Browser [pdf]) that include overviews of the system. But we're uncomfortable publishing any more detail because malware authors can read that information too. We've observed malware trying to hide from our automated systems, and a large part of my job is tweaking the system to adapt to new types of malware. Staying on top of what malware is doing isn't easy, and providing exact descriptions of the scanners and what they detect would make it impossible.

Earlier this year Google released the Safe Browsing diagnostic page, which lists general information about what Google's automated scanners found on a website. Some webmasters have found that information helpful in cleaning up their sites. Hopefully we can find other ways to share information with webmasters without compromising the success of the scanning system. If you've got specific ideas, please let me know!