Microsoft has a plan to improve Bing’s poor indexing

Microsoft has publicly admitted that Bing is slow at indexing and has outlined how it plans to improve both its crawler and its broader approach to collecting the Web's data.

Recently a user on the Bing community forums asked why his new site was taking so long to be indexed by Redmond's search engine, and a Microsoft employee replied that the slowness was unfortunate but expected behavior.

User prathaban1 wanted to know why www.kidandparent.in, which he submitted to the Webmaster Console using a sitemap.xml almost six weeks ago (and which Bing calculates has about 14 backlinks), was not getting any love from Microsoft. He said he managed to get the homepage indexed with the help of Brett Yount, Program Manager of the Bing Webmaster Center, but that was nothing compared to what Google and Yahoo had done: they had indexed almost 400 and 200 pages, respectively.
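For context, a sitemap is just an XML file listing the URLs a site wants crawled (the sitemaps.org protocol allows up to 50,000 URLs per file). A minimal sketch of generating one with Python's standard library, using placeholder URLs rather than the site mentioned above:

```python
from xml.etree import ElementTree as ET

# Hypothetical page URLs; a real site would list every page it wants crawled
urls = [
    "http://www.example.com/",
    "http://www.example.com/articles/",
]

# The sitemaps.org protocol namespace all crawlers expect
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

urlset = ET.Element("urlset", xmlns=NS)
for u in urls:
    url_el = ET.SubElement(urlset, "url")
    ET.SubElement(url_el, "loc").text = u  # each <url> needs a <loc> child

xml = ET.tostring(urlset, encoding="unicode")
print(xml)
```

The resulting file is saved as sitemap.xml at the site root and its location submitted to each engine's webmaster tools, as prathaban1 did.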

In just four minutes (if only Bing indexed that quickly) Yount had posted his reply. The response wasn't good news. "It is well known in the industry that MSNbot is fairly slow," he wrote. "I suggest reading our FAQs stickied at the top of the indexing forum to get some ideas of what to do." The instructions Yount is referring to are outlined in a forum topic titled "Sites not in the index":

[I]f your site is not in the index, please do the following:

Verify in our tools that your site is not blocked.

Run a site: query to verify there are no pages in the index.

Copy the URL of the site: query and post it on this thread.

I will work with you to at least get your home page indexed. Deeper indexing will require good content and backlinks as described in the FAQ.
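The first of those checks, verifying that a site isn't blocking the crawler, can also be done locally by reading the site's robots.txt. A minimal sketch using Python's standard library; the robots.txt content and URLs here are hypothetical, and "msnbot" was the user-agent string Bing's crawler used at the time:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; in practice you would fetch
# http://www.example.com/robots.txt and check it hasn't locked the bot out
robots_txt = """\
User-agent: msnbot
Disallow: /private/

User-agent: *
Disallow:
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# True means MSNBot may crawl the page; False means it is blocked
print(rp.can_fetch("msnbot", "http://www.example.com/"))
print(rp.can_fetch("msnbot", "http://www.example.com/private/page"))
```

If the homepage itself comes back blocked, no amount of sitemap submission will get the site indexed, which is why Yount puts this check first.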

Scanning through the thread, it appears that Yount spends a lot of time telling users what they are doing wrong that results in Bing indexing their site incompletely or not at all. In other cases, though, he contacts the Bing indexing team so that it can figure out what is going wrong on Microsoft's end. At the time of writing, the thread had 858 replies, the first posted on November 11, 2009.

MSN Search, Live Search, and now Bing have all suffered from a small index that is updated slowly. One of the biggest problems Microsoft has with Bing is thus a basic one: while Bing has a lot of great features, it continues to struggle with the same issue as previous incarnations of the service. Its index size, and the relevance of the results it can return for less popular queries, pales in comparison to what Google and Yahoo offer.

We asked Microsoft how it was planning to improve Bing's indexing problem. "We're always working to improve the crawler," a Microsoft spokesperson told Ars. "With our latest crawler release still in beta, we doubled our crawling capacity worldwide. We increased our sitemap URL size to 50K and we made it easier for webmasters to control the crawler's aggressiveness."

The trouble is that the competition isn't waiting for Bing to catch up: their indexes are growing at rapid rates as well. Naturally, as the Internet continues to expand, so does the number of sites they index. "But we know we have to continue to build a system which better reflects the changing state of the Web," the Microsoft spokesperson continued. "We introduced the ability to crawl and surface fresh results in minutes last year which enables us to more aggressively index fast-changing domains (like news sites) and crawling itself isn't the answer to the new Web landscape."

That's where Microsoft's new strategy with Bing (compared to MSN Search and Live Search) comes in: the company is well aware that it does not have the resources to take Google head-on. As such, it is prettying up how Bing displays search results, focusing on social networking, and improving areas it thinks users are most interested in when they search, such as health and auto.

"For things like Visual Search and Twitter Search, we work with data providers to get high-quality, structured data that don't have to go through the computationally complex and never-100%-accurate process of mining pages for meaning," the Microsoft spokesperson told Ars. "As we work toward augmenting our decision engine with more tools to help customers actually get things done on the Web, the need for a new notion of crawling and indexing arises. While we're continually focused on how to improve freshness, speed, politeness and intelligence of what has to be crawled or crawled again, we have significant investments in semantic, structured, and real-time indexing that are required to do more than just return URLs for keywords."

To sum up, Microsoft wants to keep improving in the areas webmasters expect from a standard search crawler, while at the same time broadening how Bing collects and presents the Web's constantly changing data.

Microsoft's strategy may be spot-on for slowly gaining market share, but it won't work if webmasters who want to care about Bing keep getting frustrated. They aren't interested in waiting around until Microsoft improves MSNBot to the point where it notices their websites automatically, and the workaround Microsoft currently offers is flawed. Microsoft hopes that the new Search Engine Optimization Toolkit it posted this week, which works with Google, Yahoo, and Bing, will help.