That’s now changed: you can create a single XML Sitemap that contains any combination of these content types. The Google Webmaster Central blog post doesn’t mention News Sitemaps, so presumably news content can’t be mixed with the other types.
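For illustration, a combined file might look something like the sketch below, which mixes a plain web URL with entries using Google’s documented image and video extension namespaces (the URLs are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <!-- An ordinary web page entry -->
  <url>
    <loc>http://www.example.com/article.html</loc>
  </url>
  <!-- A page with image content, using the image extension namespace -->
  <url>
    <loc>http://www.example.com/gallery.html</loc>
    <image:image>
      <image:loc>http://www.example.com/photo.jpg</image:loc>
    </image:image>
  </url>
  <!-- A page with video content, using the video extension namespace -->
  <url>
    <loc>http://www.example.com/clips.html</loc>
    <video:video>
      <video:content_loc>http://www.example.com/clip.flv</video:content_loc>
      <video:title>Example clip</video:title>
    </video:video>
  </url>
</urlset>
```

The extension elements live in their own namespaces, which is what the discussion of parser compatibility below turns on.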

Great news for site owners? Possibly. A single Sitemap may be easier to create and maintain in some cases, but the lowest-overhead approach is generally a script that creates and updates the files automatically, and it might be easier to keep track of things separately.

Indexing metrics

Certainly, from a metrics perspective, it may make sense to keep content types separated. When you submit an XML Sitemap to Google Webmaster Tools, you can see a report of the total number of URLs in the Sitemap and the number of those URLs indexed. For a long time, I’ve suggested that site owners create separate Sitemaps for different types of content and page types to easily track what percentage of each of those types Google is indexing. (This report provides much more accurate numbers than a site: operator search, although it only works accurately if the Sitemap contains a comprehensive and canonical list of URLs for a category.)

The previous limits on URLs and size (50,000 URLs and 10MB) still apply as well, so many sites will find that everything won’t fit in one file in any case.
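Sites that exceed those limits can use a Sitemap index file, which is part of the standard sitemaps.org protocol, to point at multiple Sitemap files. Keeping one file per content type, as in this hypothetical sketch, also preserves the per-type indexing metrics described above:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- One Sitemap per content type keeps each under the 50,000-URL /
       10MB limits and lets Webmaster Tools report indexing per type -->
  <sitemap>
    <loc>http://www.example.com/sitemap-pages.xml</loc>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap-videos.xml</loc>
  </sitemap>
</sitemapindex>
```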

What about the sitemaps.org alliance?

The question I have about this move, however, is what about sitemaps.org? In 2006, Google, Microsoft, and Yahoo came together to support a joint protocol. While the standard XML Sitemaps protocol for web pages is supported jointly, Google launched specialized Sitemaps on their own, and not as part of that alliance. If site owners start modifying their web XML Sitemaps to include additional markup needed by the additional formats, won’t that break the other engines’ ability to parse the files? Doesn’t that mean that for all practical purposes, Google is encouraging site owners to submit XML Sitemap files that don’t adhere to the standard and can be used only by Google?

In spite of sitemaps.org and the later joint support of autodiscovery in robots.txt, it appears that Google isn’t keeping either in mind as they evolve XML Sitemaps. They may say that they would love for the other engines to support this new combined format, but they didn’t involve Microsoft or Yahoo as they developed the specialized formats, and it’s unlikely they gave the other engines advance warning of this launch so they could at least adjust their parsers to handle the combined format.

Maybe this move is no big deal. But a lot of work went into the launch of that alliance, with the goal of making things easier for content owners web-wide, and not just for one search engine. So the partial dismantling of it, even in spirit, is a bit disappointing.

Update: It’s true that the specialized elements shouldn’t break existing parsers, even if those parsers weren’t built specifically for those additional elements. However, I do think it’s quite possible that the existing (non-Google) parsers aren’t set up to process files with additional elements and this change could break them. It’s definitely the case that the search engines aren’t working together on these extensions, and I’d just like to see more cooperation and advancement of the original aims of the alliance: making it easier for content owners to work with search engines.



About The Author

Vanessa Fox is a Contributing Editor at Search Engine Land. She built Google Webmaster Central and went on to found software and consulting company Nine By Blue and create Blueprint Search Analytics, which she later sold. Her book, Marketing in the Age of Google (updated edition, May 2012), provides a foundation for incorporating search strategy into organizations of all levels. Follow her on Twitter at @vanessafox.


http://seo-website-designer.com tiggerito

It should not break other parsers as long as they correctly understand XML.

The extended data is placed in its own namespace and so should be ignored by parsers that don’t understand that namespace and its associated schema. They are not breaking the sitemap schema, just extending it in the way XML (eXtensible Markup Language) was designed to be used.
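A minimal sketch of that behavior, using Python’s standard-library XML parser and Google’s image extension namespace (the document and URLs are made up for illustration): a parser that queries only the core sitemap namespace never even sees the extension elements.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# A sitemap mixing standard <url> entries with elements from an
# extension namespace (Google's image sitemap extension).
xml_doc = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>http://example.com/page.html</loc>
    <image:image>
      <image:loc>http://example.com/photo.jpg</image:loc>
    </image:image>
  </url>
</urlset>"""

root = ET.fromstring(xml_doc)
# Query only elements in the core sitemap namespace; the image:*
# elements are in a different namespace, so they are simply never
# matched and cannot confuse this parser.
locs = [loc.text for loc in root.iter(f"{{{SITEMAP_NS}}}loc")]
print(locs)  # ['http://example.com/page.html']
```

Of course, this only holds for parsers that handle namespaces correctly; a parser that matches on raw tag names or validates strictly against the original schema could still choke, which is the concern raised above.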

In reality there may be some systems that do have issues though!

It may cause a bit of a war if the others decide on different schemas and we have to add multiple sets of data to please every search engine.

I think this may also happen with all the different and overlapping RDFa schemas that are coming out.

http://ninebyblue.com/ Vanessa Fox

*Should* be ignored by parsers, but as you point out, reality may be quite different. The real point though is the latter part of your comment. They just don’t seem to be working together on this anymore, which is why it seems like the alliance may be over in spirit.