Sitemap formats

Google supports several sitemap formats, described here.

All formats limit a single sitemap to 50 MB (uncompressed) and 50,000 URLs. If your file is larger or you have more URLs, break your list into multiple sitemaps. You can optionally create a sitemap index file (a file that points to a list of sitemaps) and submit that single index file to Google. You can also submit multiple sitemaps and/or sitemap index files to Google.
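As a rough sketch of the splitting step described above, the following Python snippet divides a URL list into groups of at most 50,000 and builds a sitemap index that points at one file per group. The page URLs, the sitemap1.xml, sitemap2.xml naming scheme, and the helper names are assumptions made for illustration. The sketch checks only the URL count, not the 50 MB size limit.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-sitemap URL limit stated above


def chunk_urls(urls, size=MAX_URLS):
    """Split a list of URLs into consecutive groups of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index document (as a string) listing each sitemap URL."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(root, encoding="unicode")


# Hypothetical site with 120,000 pages.
pages = [f"http://www.example.com/page{i}.html" for i in range(120_000)]
chunks = chunk_urls(pages)

# Hypothetical naming scheme for the individual sitemap files.
index_xml = build_sitemap_index(
    f"http://www.example.com/sitemap{n}.xml" for n in range(1, len(chunks) + 1)
)
print(len(chunks))  # 3
```

Each group would then be written out as its own sitemap file, and only the index would need to be submitted.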

Google supports the standard sitemap protocol. Google also supports XML extensions for video, images, and news resources; use these extensions to describe video files, images, and other hard-to-parse content on your site to improve how we index these resources.

A very basic XML sitemap that includes the location of a single URL consists of a urlset root element containing one url entry, whose loc child holds the URL.
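As a minimal sketch, a sitemap of this shape could be generated with Python's standard library. The function name build_sitemap and the page URL are assumptions for illustration; the namespace is the one defined by the standard sitemap protocol.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap document (as a string) with one <url><loc> per URL."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(root, "url")
        ET.SubElement(entry, "loc").text = url  # special characters are entity-escaped on output
    ET.indent(root)  # pretty-print; requires Python 3.9 or later
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical single-page site.
sitemap_xml = build_sitemap(["http://www.example.com/foo.html"])
print(sitemap_xml)
```

When saving the result, write the file as UTF-8 so that it matches the encoding named in the XML declaration.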

If you have a blog with an RSS or Atom feed, you can submit the feed's URL as a sitemap. Most blog software can create a feed for you, but note that the feed provides information only on recent URLs.

If you've created and verified a site using Google Sites, Sites will automatically generate a sitemap for you. You cannot modify the sitemap, but you can submit it to Google if you want to read the sitemap report data. Note that your sitemap might not be displayed properly if you have more than 1,000 pages in a single sub-directory.

If your site is hosted at Google Sites, your sitemap URL is http://sites.google.com/site/yoursitename/system/feeds/sitemap

If you created your site using Google Apps, your sitemap URL is http://sites.google.com/yourdomain/yoursitename/system/feeds/sitemap

Sitemap extensions for additional media types

Google supports extended sitemap syntax for the following media types: video, images, and news. Use these extensions to describe video files, images, and other hard-to-parse content on your site to improve indexing.

General sitemap guidelines

Use consistent, fully-qualified URLs. Google will crawl your URLs exactly as listed. For instance, if your site is at http://www.example.com/, don't specify a URL as http://example.com/ (without the www) or ./mypage.html (a relative URL).
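One way to catch such inconsistencies before submitting is a small check like the sketch below. The function name check_url and its site_root parameter are hypothetical; the check only compares scheme and host against the site root and is not a full validator.

```python
from urllib.parse import urlsplit


def check_url(url, site_root="http://www.example.com/"):
    """Return a list of problems found in `url`; empty if it looks consistent."""
    root = urlsplit(site_root)
    parts = urlsplit(url)
    problems = []
    if not parts.scheme or not parts.netloc:
        # Relative URLs such as ./mypage.html have no scheme or host.
        problems.append("not fully qualified")
    else:
        if parts.scheme != root.scheme:
            problems.append("scheme differs from site root")
        if parts.netloc.lower() != root.netloc.lower():
            problems.append("host differs from site root")
    return problems


print(check_url("http://www.example.com/mypage.html"))  # []
print(check_url("http://example.com/"))                 # ['host differs from site root']
print(check_url("./mypage.html"))                       # ['not fully qualified']
```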

Don't include session IDs in the URLs in your sitemap. This reduces duplicate crawling of those URLs.
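A sketch of how session IDs might be stripped from query strings before URLs are written to a sitemap. The parameter names in SESSION_PARAMS are assumptions; substitute whatever your site actually uses. Session IDs embedded in the path (rather than the query string) are not handled here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical session parameter names; adjust for your own site.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}


def strip_session_ids(url):
    """Return `url` with any session-ID query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


urls = [
    "http://www.example.com/page.html?id=7&sid=abc123",
    "http://www.example.com/page.html?id=7&sid=def456",
]
# Both variants collapse to the same URL once the session ID is removed.
unique = sorted({strip_session_ids(u) for u in urls})
print(unique)  # ['http://www.example.com/page.html?id=7']
```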

Point out translated versions of a URL to Google for crawling and indexing by listing the canonical URLs for each language in your sitemap file and by using hreflang annotations.

Break up large sitemaps into smaller sitemaps to prevent your server from being overloaded if Google requests your sitemap frequently. A sitemap file can't contain more than 50,000 URLs and must be no larger than 50 MB uncompressed.

Use a sitemap index file to list all your sitemaps and submit this single file to Google rather than submitting individual sitemaps.

Use recommended canonicalization methods to tell Google if your site is accessible on both the www and non-www versions of your domain. You need to submit a sitemap for only your preferred domain.

Familiarize yourself with our Webmaster Guidelines and, if you're considering hiring a consultant to help you optimize your sitemaps, with our SEO Starter Guide. It can also be useful to check with colleagues who run similar sites or businesses to get the most out of your sitemap.

Use sitemap extensions for pointing to additional media types such as video, images, and news.

If you have different URLs for mobile and desktop versions of a page, we recommend pointing only to one version. However, if you feel the need to point to both URLs, annotate your URLs to indicate the desktop and mobile versions.

If you have alternate pages for different languages or regions, you can use either a sitemap or hreflang to indicate the alternate URLs.

Non-alphanumeric and non-Latin characters. We require your sitemap file to be UTF-8 encoded (you can generally do this when you save the file). As with all XML files, any data values (including URLs) must use entity escape codes for the characters listed in the table below. A sitemap can contain only ASCII characters; it can't contain upper ASCII characters or certain control codes or special characters such as * and {}. If your sitemap URL contains these characters, you'll receive an error when you try to add it.

Character        Symbol    Escape Code
Ampersand        &         &amp;
Single Quote     '         &apos;
Double Quote     "         &quot;
Greater Than     >         &gt;
Less Than        <         &lt;

In addition, all URLs (including the URL of your sitemap) must be URL-escaped and encoded for readability by the web server on which they are located. However, if you are using any sort of script, tool, or log file to generate your URLs (anything except typing them in by hand), this is usually already done for you. If you submit your sitemap and receive an error that Google is unable to find some of your URLs, check that your URLs follow the RFC 3986 standard for URIs, the RFC 3987 standard for IRIs, and the XML standard.

Here is an example of a URL that uses a non-ASCII character (ü), as well as a character that requires entity escaping (&): http://www.example.com/ümlat.html&q=name
Here is that same URL, ISO-8859-1 encoded (for hosting on a server that uses that encoding) and URL escaped: http://www.example.com/%FCmlat.html&q=name
Here is that same URL, UTF-8 encoded (for hosting on a server that uses that encoding) and URL escaped: http://www.example.com/%C3%BCmlat.html&q=name
Here is that same URL, entity escaped: http://www.example.com/%C3%BCmlat.html&amp;q=name
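The two steps shown above (URL escaping, then entity escaping) can be sketched with Python's standard library. The function name sitemap_escape and the choice of characters left unescaped are assumptions for illustration; the safe set keeps common URL delimiters and existing percent escapes intact.

```python
from urllib.parse import quote
from xml.sax.saxutils import escape


def sitemap_escape(url, encoding="utf-8"):
    """URL-escape `url` in the given encoding, then entity-escape it for XML."""
    # Percent-encode everything outside the safe set (for example, non-ASCII
    # characters), leaving URL delimiters and existing % escapes untouched.
    url_escaped = quote(url, safe=":/?#[]@!$&'()*+,;=%", encoding=encoding)
    # Entity-escape &, <, > (handled by default) plus single and double quotes.
    return escape(url_escaped, {"'": "&apos;", '"': "&quot;"})


raw = "http://www.example.com/ümlat.html&q=name"
print(sitemap_escape(raw))
# http://www.example.com/%C3%BCmlat.html&amp;q=name
print(sitemap_escape(raw, encoding="iso-8859-1"))
# http://www.example.com/%FCmlat.html&amp;q=name
```

If you build the sitemap with an XML library, the library usually performs the entity-escaping step when it serializes the document, so only the URL escaping needs to be applied by hand.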

Make your sitemap available to Google (Submit your sitemap to Google)

There are two different ways to make your sitemap available to Google: