Sitemap formats

Google supports several sitemap formats, described here. Google expects the standard sitemap protocol in all formats. Google does not currently consume the <priority> element in sitemaps.

All formats limit a single sitemap to 50 MB (uncompressed) and 50,000 URLs. If you have a larger file or more URLs, you must break your list into multiple sitemaps. You can optionally create a sitemap index file (a file that points to a list of sitemaps) and submit that single index file to Google. You can also submit multiple sitemaps, multiple sitemap index files, or both.
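The limits above can be enforced at generation time. The following is a minimal Python sketch, not Google's tooling: the names split_urls and render_sitemap are hypothetical, 50 MB is assumed to mean 50 x 1,048,576 bytes, and only the <loc> element is written for each URL.

```python
from typing import Iterable, Iterator, List
from xml.sax.saxutils import escape

MAX_URLS = 50_000              # per-sitemap URL limit
MAX_BYTES = 50 * 1024 * 1024   # 50 MB uncompressed (assumed to be binary megabytes)

HEADER = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
)
FOOTER = "</urlset>\n"


def url_entry(url: str) -> str:
    """Render one <url> entry; escape() handles &, < and > in the URL."""
    return f"  <url><loc>{escape(url)}</loc></url>\n"


def split_urls(
    urls: Iterable[str],
    max_urls: int = MAX_URLS,
    max_bytes: int = MAX_BYTES,
) -> Iterator[List[str]]:
    """Yield batches of URLs that each fit within both limits."""
    overhead = len((HEADER + FOOTER).encode("utf-8"))
    batch: List[str] = []
    size = overhead
    for url in urls:
        entry_size = len(url_entry(url).encode("utf-8"))
        # Start a new batch when adding this URL would break either limit.
        if batch and (len(batch) >= max_urls or size + entry_size > max_bytes):
            yield batch
            batch, size = [], overhead
        batch.append(url)
        size += entry_size
    if batch:
        yield batch


def render_sitemap(batch: Iterable[str]) -> str:
    """Render one batch as a sitemap document."""
    return HEADER + "".join(url_entry(u) for u in batch) + FOOTER
```

Each batch returned by split_urls can be passed to render_sitemap and saved as its own file.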

If you have a blog with an RSS or Atom feed, you can submit the feed's URL as a sitemap. Most blog software can create a feed for you, but note that this feed provides information only on recent URLs.

If you've created and verified a site using Google Sites, Sites will automatically generate a sitemap for you. You cannot modify the sitemap, but you can submit it to Google if you want to read the sitemap report data. Note that your sitemap might not be displayed properly if you have more than 1,000 pages in a single sub-directory.

If your site is hosted at Google Sites, your sitemap URL is http://sites.google.com/site/yoursitename/system/feeds/sitemap

If you created your site using Google Apps, your sitemap URL is http://sites.google.com/yourdomain/yoursitename/system/feeds/sitemap

Sitemap extensions for additional media types

Google supports extended sitemap syntax for the following media types. Use these extensions to describe video files, images, and other hard-to-parse content on your site to improve indexing.

General sitemap guidelines

Use consistent, fully-qualified URLs. Google will crawl your URLs exactly as listed. For instance, if your site is at https://www.example.com/, don't specify a URL as https://example.com/ (missing www) or ./mypage.html (a relative URL).
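A quick consistency check can catch the two mistakes described above before a sitemap is published. This Python sketch is illustrative only; the function name inconsistent_urls is hypothetical, and it assumes that "consistent" means every URL shares the scheme and host of the site root.

```python
from typing import Iterable, List
from urllib.parse import urlsplit


def inconsistent_urls(urls: Iterable[str], site_root: str) -> List[str]:
    """Return the URLs whose scheme or host differs from site_root.

    Relative URLs have an empty scheme and host, so they are reported too.
    """
    root = urlsplit(site_root)
    expected = (root.scheme.lower(), root.netloc.lower())
    bad = []
    for url in urls:
        parts = urlsplit(url)
        if (parts.scheme.lower(), parts.netloc.lower()) != expected:
            bad.append(url)
    return bad
```

With a site root of https://www.example.com/, both https://example.com/ and ./mypage.html would be reported.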

Don't include session IDs in the URLs in your sitemap. Leaving them out reduces duplicate crawling of those URLs.
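Session IDs carried in the query string can be removed before URLs are written to the sitemap. The sketch below is a rough illustration: the parameter names in SESSION_PARAMS are hypothetical and should be replaced with whatever your site actually uses. Note that the remaining query parameters are re-encoded by urlencode, which may change how some characters are escaped.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names; adjust to match your own site.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}


def strip_session_ids(url: str) -> str:
    """Return url with any session ID query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

Applying this to every URL and then removing duplicates leaves one entry per page rather than one per session.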

Break up large sitemaps into smaller sitemaps to prevent your server from being overloaded if Google requests your sitemap frequently. A sitemap file can't contain more than 50,000 URLs and must be no larger than 50 MB uncompressed. Use a sitemap index file to list all the individual sitemaps and submit this single file to Google rather than submitting individual sitemaps.
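A sitemap index that lists the individual sitemaps can be generated in a few lines. This Python sketch assumes the index uses the sitemapindex, sitemap, and loc elements of the standard sitemap protocol; the function name build_sitemap_index and the file names in the usage note are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Iterable

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls: Iterable[str]) -> bytes:
    """Return a UTF-8 encoded sitemap index listing the given sitemap URLs."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        sitemap = ET.SubElement(root, "sitemap")
        # ElementTree entity-escapes &, < and > in element text on output.
        ET.SubElement(sitemap, "loc").text = url
    declaration = b'<?xml version="1.0" encoding="UTF-8"?>\n'
    return declaration + ET.tostring(root, encoding="utf-8")
```

For example, build_sitemap_index(["https://www.example.com/sitemap1.xml", "https://www.example.com/sitemap2.xml"]) returns the bytes of an index file that can be saved and submitted as a single file.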

Use recommended canonicalization methods to tell Google if your site is accessible on both the www and non-www versions of your domain. You need to submit a sitemap for only your preferred domain.

Use sitemap extensions for pointing to additional media types such as video, images, and news.

If you have different URLs for mobile and desktop versions of a page, we recommend pointing to only one version. However, if you feel the need to point to both URLs, annotate your URLs to indicate the desktop and mobile versions.

Non-alphanumeric and non-Latin characters. We require your sitemap file to be UTF-8 encoded (you can generally do this when you save the file). As with all XML files, any data values (including URLs) must use entity escape codes for the characters listed in the table below. A sitemap can contain only ASCII characters; it can't contain upper ASCII characters or certain control codes or special characters such as * and {}. If your sitemap URL contains these characters, you'll receive an error when you try to add it.

Character       Symbol   Escape Code
Ampersand       &        &amp;
Single Quote    '        &apos;
Double Quote    "        &quot;
Greater Than    >        &gt;
Less Than       <        &lt;
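The five escape codes in the table above can be applied with Python's standard library rather than by hand. This is a small sketch; the helper name entity_escape is hypothetical. It should be applied to raw values only, because running it on text that is already escaped would escape the ampersands a second time.

```python
from xml.sax.saxutils import escape

# escape() handles &, < and > by default; the quote characters are added here.
QUOTE_ENTITIES = {"'": "&apos;", '"': "&quot;"}


def entity_escape(value: str) -> str:
    """Replace the five special characters with their XML escape codes."""
    return escape(value, QUOTE_ENTITIES)
```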

In addition, all URLs (including the URL of your sitemap) must be encoded for readability by the web server on which they are located and URL-escaped. However, if you are using any sort of script, tool, or log file to generate your URLs (anything except typing them in by hand), this is usually already done for you. If you submit your sitemap and you receive an error that Google is unable to find some of your URLs, check to make sure that your URLs follow the RFC-3986 standard for URIs, the RFC-3987 standard for IRIs, and the XML standard.

Here is an example of a URL that uses a non-ASCII character (ü), as well as a character that requires entity escaping (&): http://www.example.com/ümlat.html&q=name
Here is that same URL, ISO-8859-1 encoded (for hosting on a server that uses that encoding) and URL escaped: http://www.example.com/%FCmlat.html&q=name
Here is that same URL, UTF-8 encoded (for hosting on a server that uses that encoding) and URL escaped: http://www.example.com/%C3%BCmlat.html&q=name
Here is that same URL, entity escaped: http://www.example.com/%C3%BCmlat.html&amp;q=name
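The two steps shown above, URL escaping followed by entity escaping, can be reproduced with Python's standard library. This is a sketch under assumptions: the function names are hypothetical, and the set of characters left unescaped (SAFE_CHARS) is a choice made for this example, not a rule from the sitemap protocol. Because "%" is treated as safe, existing percent escapes are not encoded twice, but a stray "%" would also pass through unchanged.

```python
from urllib.parse import quote
from xml.sax.saxutils import escape

# Characters kept as-is during URL escaping (an assumption for this sketch).
SAFE_CHARS = ":/?&=#%"


def url_escape(url: str, encoding: str = "utf-8") -> str:
    """Percent-encode characters outside SAFE_CHARS using the server's encoding."""
    return quote(url, safe=SAFE_CHARS, encoding=encoding)


def sitemap_loc(url: str, encoding: str = "utf-8") -> str:
    """URL-escape, then entity-escape, a URL for use inside a sitemap."""
    return escape(url_escape(url, encoding))
```

Passing encoding="iso-8859-1" to url_escape corresponds to the ISO-8859-1 example, and the default corresponds to the UTF-8 example.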

Make your sitemap available to Google (Submit your sitemap to Google)

There are a few different ways to make your sitemap available to Google:

Insert the following line anywhere in your robots.txt file, specifying the path to your sitemap:
Sitemap: http://example.com/sitemap_location.xml
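To confirm that a robots.txt file declares a sitemap, or to add the line when it is missing, Python's urllib.robotparser can read the Sitemap entries (the site_maps() method requires Python 3.8 or later). The helper names below are hypothetical, and the sketch works on the file's text rather than fetching it from a server.

```python
from typing import List
from urllib.robotparser import RobotFileParser


def sitemaps_in_robots(robots_txt: str) -> List[str]:
    """Return the sitemap URLs declared in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


def ensure_sitemap_line(robots_txt: str, sitemap_url: str) -> str:
    """Append a Sitemap line unless the URL is already declared."""
    if sitemap_url in sitemaps_in_robots(robots_txt):
        return robots_txt
    separator = "" if not robots_txt or robots_txt.endswith("\n") else "\n"
    return f"{robots_txt}{separator}Sitemap: {sitemap_url}\n"
```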

Use the "ping" functionality to ask us to crawl your sitemap. Send an HTTP GET request like this: http://www.google.com/ping?sitemap=<complete_url_of_sitemap>
for example: http://www.google.com/ping?sitemap=https://example.com/sitemap.xml
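The ping URL can be assembled programmatically. The sketch below only builds the URL and does not send the request; the function name build_ping_url is hypothetical. It leaves ":" and "/" unescaped so that simple sitemap URLs match the example above, and it percent-encodes other reserved characters, which is an assumption made here so that a query string in the sitemap URL is not read as extra parameters of the ping request.

```python
from urllib.parse import quote

PING_ENDPOINT = "http://www.google.com/ping?sitemap="


def build_ping_url(sitemap_url: str) -> str:
    """Return the ping URL for the given complete sitemap URL."""
    return PING_ENDPOINT + quote(sitemap_url, safe=":/")
```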