Alternatively, you could create a sitemap manually by following the XML sitemap code structure. Technically, your sitemap doesn’t even need to be in XML format: a text file with one URL per line will suffice.
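To make the two formats concrete, here is a minimal sketch in Python that writes the same list of URLs both ways. The function names and the example.com URLs are placeholders made up for illustration; only the `urlset`, `url`, and `loc` elements and the namespace come from the sitemap protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_xml_sitemap(urls):
    """Return a minimal XML sitemap (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" in the URL for us.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


def build_text_sitemap(urls):
    """Return a plain-text sitemap: one URL per line."""
    return "\n".join(urls) + "\n"


# Hypothetical URLs, used only for illustration.
example_urls = ["https://www.example.com/", "https://www.example.com/about"]
print(build_xml_sitemap(example_urls))
print(build_text_sitemap(example_urls))
```

The text version is clearly less work, but it can only carry URLs. Anything extra, such as modification dates, needs the XML form.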

However, you will need to generate a complete XML sitemap if you want to implement the hreflang attribute, so it’s much easier just to let a tool do the work for you.
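For a sense of why hreflang makes manual editing tedious, here is a rough Python sketch of a sitemap with language alternates. It assumes the usual pattern of `xhtml:link` elements with `rel="alternate"`, where every entry lists all language versions of the page, including itself. The `build_hreflang_sitemap` function and the URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Serialize with a default namespace for sitemap tags and an "xhtml:" prefix.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("xhtml", XHTML_NS)


def build_hreflang_sitemap(page_groups):
    """Build a sitemap string with hreflang alternates.

    page_groups is a list of dicts, each mapping a language code to the URL
    of that language version of one page.
    """
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for versions in page_groups:
        for url in versions.values():
            entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
            # Each entry lists every language version, including itself.
            for lang, alt_url in versions.items():
                ET.SubElement(
                    entry,
                    f"{{{XHTML_NS}}}link",
                    {"rel": "alternate", "hreflang": lang, "href": alt_url},
                )
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical page available in English and German.
groups = [{"en": "https://www.example.com/en/", "de": "https://www.example.com/de/"}]
print(build_hreflang_sitemap(groups))
```

Two languages already produce four link elements for a single page, which is why this is rarely maintained by hand.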

Visit the official Google and Bing pages for more information on how to manually set up your sitemap.

2. Submit Your Sitemap to Google

Test your sitemap and view the results before you click Submit Sitemap to check for errors that may prevent key landing pages from being indexed.

Ideally, you want the number of pages indexed to be the same as the number of pages submitted.

Note that submitting your sitemap tells Google which pages you consider to be high quality and worthy of indexation, but it does not guarantee that they’ll be indexed.

Instead, the benefit of submitting your sitemap is to:

Help Google understand how your website is laid out.

Discover errors you can correct to ensure your pages are indexed properly.

3. Prioritize High-Quality Pages in Your Sitemap

When it comes to ranking, overall site quality is a key factor.

If your sitemap directs bots to thousands of low-quality pages, search engines may interpret this as a sign that your website is probably not one people will want to visit, even if those pages are necessary for your site, such as login pages.

Instead, try to direct bots to the most important pages on your site. Ideally, these are pages that:

Are highly optimized.

Include images and video.

Have lots of unique content.

Prompt user engagement through comments and reviews.

4. Isolate Indexation Problems

Google Search Console can be a bit frustrating when Google doesn’t index all of your pages, because it doesn’t tell you which pages are problematic.

For example, if you submit 20,000 pages and only 15,000 of those are indexed, you won’t be told what the 5,000 “problem pages” are.

This is especially true of large e-commerce websites that have multiple pages for very similar products.

SEO Consultant Michael Cottam has written a useful guide for isolating problematic pages. He recommends splitting product pages into different XML sitemaps and testing each of them.

Create sitemaps that will test hypotheses, such as “pages that don’t have product images aren’t getting indexed” or “pages without unique copy aren’t getting indexed.”

When you’ve isolated the main problems, you can either work to fix the problems or set those pages to “noindex,” so they don’t diminish your overall site quality.
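The split-by-hypothesis idea can be sketched roughly as follows. This is not Cottam’s own procedure, just one way to group URLs so each group can be submitted as its own sitemap. The `has_image` and `has_unique_copy` fields and the bucket names are assumptions about how your page data might look.

```python
from collections import defaultdict


def bucket_pages(pages):
    """Group page URLs into named buckets, one future sitemap per hypothesis."""
    buckets = defaultdict(list)
    for page in pages:
        if not page["has_image"]:
            buckets["no-product-image"].append(page["url"])
        elif not page["has_unique_copy"]:
            buckets["no-unique-copy"].append(page["url"])
        else:
            # Pages with neither suspected problem act as the comparison group.
            buckets["control"].append(page["url"])
    return dict(buckets)


# Hypothetical product pages, used only for illustration.
pages = [
    {"url": "https://www.example.com/p/1", "has_image": True, "has_unique_copy": True},
    {"url": "https://www.example.com/p/2", "has_image": False, "has_unique_copy": True},
    {"url": "https://www.example.com/p/3", "has_image": True, "has_unique_copy": False},
]
for name, urls in bucket_pages(pages).items():
    print(name, len(urls))
```

If the “no-product-image” sitemap shows a much lower indexed count than the control sitemap, that hypothesis is worth acting on.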

Update: Google Search Console has been recently updated in terms of Index Coverage. In particular, the problem pages are now listed and the reasons why Google isn’t indexing some URLs are provided.

5. Include Only Canonical Versions of URLs in Your Sitemap

When you have multiple pages that are very similar, such as product pages for different colors of the same product, you should use the “link rel=canonical” tag to tell Google which page is the “main” page they should crawl and index.

Bots have an easier time discovering key pages if you don’t include pages with canonical URLs pointing at other pages.
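As a small illustration, the filter below keeps only pages whose canonical URL points to themselves. The `url` and `canonical` fields are an assumed data shape, and treating a page with no canonical as self-canonical is a simplification.

```python
def canonical_only(pages):
    """Keep only URLs whose canonical target is the page itself."""
    return [
        page["url"]
        for page in pages
        # A missing canonical is treated as pointing at the page itself.
        if page.get("canonical", page["url"]) == page["url"]
    ]


# Hypothetical color variants of one product.
pages = [
    {"url": "https://www.example.com/shirt", "canonical": "https://www.example.com/shirt"},
    {"url": "https://www.example.com/shirt?color=red", "canonical": "https://www.example.com/shirt"},
    {"url": "https://www.example.com/shirt?color=blue", "canonical": "https://www.example.com/shirt"},
]
print(canonical_only(pages))  # ['https://www.example.com/shirt']
```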

6. Use Robots Meta Tag over Robots.txt Whenever Possible

When you don’t want a page to be indexed, you usually want to use the meta robots “noindex,follow” tag.

This prevents Google from indexing the page but it preserves your link equity, and it’s especially useful for utility pages that are important to your site but shouldn’t be showing up in search results.

The only time you want to use robots.txt to block pages is when those pages are eating up your crawl budget.

If you notice that Google is re-crawling and indexing relatively unimportant pages (e.g., individual product pages) at the expense of core pages, you may want to use robots.txt.
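Before you publish a robots.txt rule, it helps to check which URLs it would actually block. The sketch below uses Python’s standard `urllib.robotparser` for that. The rule, the paths, and the `blocked_urls` helper are made up for illustration, and Python’s parser may not match every nuance of how Googlebot reads robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule that keeps crawlers out of archived product pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /products/archive/
"""


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt content disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


urls = [
    "https://www.example.com/products/archive/old-widget",
    "https://www.example.com/products/new-widget",
]
print(blocked_urls(ROBOTS_TXT, urls))
```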

7. Don’t Include ‘noindex’ URLs in Your Sitemap

Speaking of wasted crawl budget, if search engine robots aren’t allowed to index certain pages, then they have no business being in your sitemap.

When you submit a sitemap that includes blocked and “noindex” pages, you’re simultaneously telling Google “it’s really important that you index this page” and “you’re not allowed to index this page.”

Lack of consistency is a common mistake.
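One way to catch this inconsistency is to scan each page for a robots meta tag before adding it to the sitemap. The following is a minimal sketch using Python’s built-in HTML parser. It only looks at `<meta name="robots">` tags, so it will miss directives sent through HTTP headers, and the class and function names are illustrative.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Records whether a page carries a robots meta tag with "noindex"."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        directives = (attr.get("content") or "").lower().split(",")
        if "noindex" in [d.strip() for d in directives]:
            self.noindex = True


def is_noindex(html):
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


def indexable_urls(pages):
    """pages maps URL -> HTML source; return URLs that are safe to list."""
    return [url for url, html in pages.items() if not is_noindex(html)]


# Hypothetical pages, used only for illustration.
pages = {
    "https://www.example.com/": "<html><head><title>Home</title></head></html>",
    "https://www.example.com/login": '<head><meta name="robots" content="noindex,follow"></head>',
}
print(indexable_urls(pages))  # ['https://www.example.com/']
```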

8. Create Dynamic XML Sitemaps for Large Sites

It’s nearly impossible to keep up with all of your meta robots on huge websites.

Instead, you should set up rule-based logic that determines when a page is included in your XML sitemap and/or switched from “noindex” to “index, follow.”

You can find detailed instructions on exactly how to create a dynamic XML sitemap but, again, this step is made much easier with the help of a tool that generates dynamic sitemaps for you.
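To give a feel for what such rules might look like, here is a small sketch in which a single function decides both the robots meta value and sitemap inclusion, so the two can never disagree. The `Page` fields, the 150-word threshold, and the rule itself are assumptions chosen for illustration, not recommended values.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    word_count: int
    has_image: bool
    in_stock: bool


MIN_WORDS = 150  # assumed quality threshold; tune for your own site


def should_index(page):
    """Single source of truth for whether a page deserves indexing."""
    return page.in_stock and page.has_image and page.word_count >= MIN_WORDS


def robots_meta(page):
    """Value for the page's robots meta tag, derived from the same rule."""
    return "index, follow" if should_index(page) else "noindex, follow"


def sitemap_urls(pages):
    """URLs to emit in the dynamically generated sitemap."""
    return [page.url for page in pages if should_index(page)]


# Hypothetical catalog entries.
catalog = [
    Page("https://www.example.com/p/1", word_count=400, has_image=True, in_stock=True),
    Page("https://www.example.com/p/2", word_count=40, has_image=True, in_stock=True),
]
print(sitemap_urls(catalog))
print([robots_meta(page) for page in catalog])
```

Because both outputs come from `should_index`, a page that improves (for example, by gaining more copy) moves into the sitemap and flips to “index, follow” at the same moment.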

9. Use XML Sitemaps & RSS/Atom Feeds

Google recommends using both sitemaps and RSS/Atom feeds to help search engines understand which pages should be indexed and updated.

By including only recently updated content in your RSS/Atom feeds you’ll make finding fresh content easier for both search engines and visitors.
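As a rough sketch of the “recent content only” idea, the function below builds a bare-bones RSS 2.0 feed from pages updated within an assumed seven-day window. The page fields, the channel title, and the window length are all placeholders. A production feed would usually carry more metadata.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def build_recent_feed(pages, now, max_age_days=7):
    """Return an RSS 2.0 string containing only recently updated pages."""
    cutoff = now - timedelta(days=max_age_days)
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Recently updated pages"
    ET.SubElement(channel, "link").text = "https://www.example.com/"
    ET.SubElement(channel, "description").text = "Latest content updates"
    recent = [page for page in pages if page["updated"] >= cutoff]
    # Newest items first.
    for page in sorted(recent, key=lambda p: p["updated"], reverse=True):
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = page["title"]
        ET.SubElement(item, "link").text = page["url"]
        # RSS expects RFC 822 style dates.
        ET.SubElement(item, "pubDate").text = format_datetime(page["updated"])
    return ET.tostring(rss, encoding="unicode")


# Hypothetical pages with timezone-aware update times.
now = datetime(2024, 6, 10, tzinfo=timezone.utc)
pages = [
    {"title": "New guide", "url": "https://www.example.com/guide",
     "updated": datetime(2024, 6, 9, tzinfo=timezone.utc)},
    {"title": "Old post", "url": "https://www.example.com/old",
     "updated": datetime(2024, 5, 1, tzinfo=timezone.utc)},
]
print(build_recent_feed(pages, now))
```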

10. Update Modification Times Only When You Make Substantial Changes

Don’t try to trick search engines into re-indexing pages by updating your modification time without making any substantial changes to your page.

Last year, I talked at length about the potential dangers of risky SEO. I won’t reiterate all my points here, but suffice it to say that Google may start removing your date stamps if they’re constantly updated without providing new value.
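If your sitemap is generated automatically, one way to enforce this is to bump the modification date only when the page text has changed by more than some threshold. The sketch below uses Python’s `difflib` for a rough similarity score. The 5% threshold is an arbitrary assumption, and what counts as “substantial” is a judgment call that a text diff can only approximate.

```python
from datetime import date
from difflib import SequenceMatcher

CHANGE_THRESHOLD = 0.05  # assumed: at least 5% of the text must differ


def is_substantial_change(old_text, new_text, threshold=CHANGE_THRESHOLD):
    """Return True when the two versions differ by at least the threshold."""
    # autojunk=False avoids difflib's heuristic that skips frequent characters.
    similarity = SequenceMatcher(None, old_text, new_text, autojunk=False).ratio()
    return (1.0 - similarity) >= threshold


def next_lastmod(old_text, new_text, current_lastmod, today):
    """Keep the existing date unless the content really changed."""
    if is_substantial_change(old_text, new_text):
        return today
    return current_lastmod


# Hypothetical page copy: a one-word tweak should not bump the date.
old = "Our widget ships in three colors and comes with a two-year warranty. " * 5
tweaked = old.replace("three", "four", 1)
print(next_lastmod(old, tweaked, date(2024, 1, 1), date(2024, 6, 1)))
```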

11. Don’t Worry Too Much About Priority Settings

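Some sitemaps have a “Priority” column that ostensibly tells search engines which pages are most important. Google’s sitemap documentation states that it ignores the priority value, so it isn’t worth spending much time on.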

12. Keep File Size as Small as Possible

The smaller your sitemap, the less strain you’re putting on your server.

Google and Bing both increased the size of accepted sitemap files from 10 MB to 50 MB in 2016, but it’s still good practice to keep your sitemap as lean as possible and prioritize your key landing pages.

13. Create Multiple Sitemaps If Your Site Includes More Than 50,000 URLs

You’re limited to 50,000 URLs per sitemap.

While this is more than enough for most sites, some sites will need to create more than one sitemap.
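The usual way to handle this is to split the URL list into files of at most 50,000 entries and list those files in a sitemap index. Here is a minimal Python sketch of both steps. The file naming scheme and the example.com URLs are made up for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive chunks of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index string that points at each child sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(index, encoding="unicode")


# Hypothetical site with 120,000 product URLs.
all_urls = [f"https://www.example.com/p/{i}" for i in range(120_000)]
chunks = chunk_urls(all_urls)
child_sitemaps = [
    f"https://www.example.com/sitemap-{n}.xml" for n in range(1, len(chunks) + 1)
]
print([len(chunk) for chunk in chunks])  # [50000, 50000, 20000]
print(build_sitemap_index(child_sitemaps))
```

You then submit the index file, and search engines follow it to each child sitemap.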