Site errors: This section of the report shows the main issues for the past 90 days that prevented Googlebot from accessing your entire site (click any box to display its chart).

URL errors: This section lists specific errors Google encountered when trying to crawl specific desktop or phone pages. Each main section in the URL Errors reports corresponds to the different crawling mechanisms Google uses to access your pages, and the errors listed are specific to those kinds of pages.

Site errors overview

In a well-operating site, the Site errors section of the Crawl Errors report should show no errors (this is true for the large majority of the sites we crawl). If Google detects any appreciable number of site errors, we'll try to notify you in the form of a message, regardless of the size of your site.

When you first view the Crawl Errors page, the Site errors section shows a quick status next to each of the three error types: DNS, Server connectivity, and robots.txt fetch. If the status is anything other than a green check mark, you can click the box to see a graph of crawling details for the last 90 days.

High error rates

If your site shows a 100% error rate in any of the three categories, it likely means that your site is either down or misconfigured in some way. This could be due to a number of possibilities that you can investigate:

Check that a site reorganization hasn't changed permissions for a section of your site.

If your site has been reorganized, check that external links still work.

Review any new scripts to ensure they are not malfunctioning repeatedly.

Make sure all directories are present and haven't been accidentally moved or deleted.

If none of these situations apply to your site, the error rate might just be a transient spike, or due to external causes (someone has linked to non-existent pages), so there might not even be a problem. In any case, when we see an unusually large number of errors for your site, we'll let you know so you can investigate.

Low error rates

If your site has an error rate less than 100% in any of the categories, it could just indicate a transient condition, but it could also mean that your site is overloaded or improperly configured. You might want to investigate these issues further, or ask about them on our forum. We might alert you even if the overall error rate is very low — in our experience, a well configured site shouldn't have any errors in these categories.

Site error types

What are DNS errors?

A DNS error means that Googlebot can't communicate with the DNS server either because the server is down, or because there's an issue with the DNS routing to your domain. While most DNS warnings or errors don't affect Googlebot's ability to access your site, they may be a symptom of high latency, which can negatively impact your users.

Fixing DNS errors

Make sure Google can crawl your site.
Use Fetch as Google on a key page, such as your home page. If it returns the content of your homepage without problems, you can assume that Google is able to access your site properly.

For persistent or recurring DNS errors, check with your DNS provider.
Often your DNS provider and your web hosting service are the same.

Configure your server to respond to non-existent hostnames with an HTTP error code such as 404 or 500.
A website such as example.com can be configured with a wildcard DNS setup to respond to requests for foo.example.com, made-up-name.example.com and any other subdomain. This makes sense in the case where a site with user-generated content gives each user account its own domain (http://username.example.com). However, in some cases, this kind of configuration can cause content to be unnecessarily duplicated across different hostnames, and it can also affect Googlebot's crawling.
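If you want unrecognized hostnames to return an error instead of wildcard content, one way to do it is a catch-all server block. The sketch below assumes an nginx server (other servers have equivalent mechanisms); the paths and hostnames are illustrative:

```nginx
# Hypothetical sketch: answer requests for unrecognized hostnames with a 404
# instead of serving wildcard content.
server {
    listen 80 default_server;
    server_name _;        # matches any hostname not handled elsewhere
    return 404;
}

server {
    listen 80;
    server_name www.example.com example.com;   # the hostnames you actually serve
    root /var/www/example;
}
```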

DNS error list

DNS Timeout

Google couldn't access your site because your DNS server did not respond to the request in a timely manner.

Use Fetch as Google to check if Googlebot can currently crawl your site. If Fetch as Google returns the content of your homepage without problems, you can assume that Google is generally able to access your site properly.

Check with your registrar to make sure your site is correctly set up and that your server is connected to the Internet.

What is a server error?

When you see this kind of error for your URLs, it means that Googlebot couldn't access your URL, the request timed out, or your site was busy. As a result, Googlebot was forced to abandon the request.

Fixing server connectivity errors

Reduce excessive page loading for dynamic page requests.
A site that delivers the same content for multiple URLs is considered to deliver content dynamically (e.g. www.example.com/shoes.php?color=red&size=7 serves the same content as www.example.com/shoes.php?size=7&color=red). Dynamic pages can take too long to respond, resulting in timeout issues. Or, the server might return an overloaded status to ask Googlebot to crawl the site more slowly. In general, we recommend keeping parameters short and using them sparingly. If you're confident about how parameters work for your site, you can tell Google how we should handle these parameters.

Check that you are not inadvertently blocking Google.
You might be blocking Google due to a system level issue, such as a DNS configuration issue, a misconfigured firewall or DoS protection system, or a content management system configuration. Protection systems are an important part of good hosting and are often configured to automatically block unusually high levels of server requests. However, because Googlebot often makes more requests than a human user, it can trigger these protection systems, causing them to block Googlebot and prevent it from crawling your website. To fix such issues, identify which part of your website's infrastructure is blocking Googlebot and remove the block. The firewall may not be under your control, so you may need to discuss this with your hosting provider.

Control search engine site crawling and indexing wisely.
Some webmasters intentionally prevent Googlebot from reaching their websites, perhaps using a firewall as described above. In these cases, usually the intent is not to entirely block Googlebot, but to control how the site is crawled and indexed. If this applies to you, check the following:

Server connectivity errors

Timeout

The server timed out waiting for the request.

Use Fetch as Google to check if Googlebot can currently crawl your site. If Fetch as Google returns the content of your homepage without problems, you can assume that Google is generally able to access your site properly.

It's possible that your server is overloaded or misconfigured. If the problem persists, check with your hosting provider.

Truncated headers

Google was able to connect to your server, but it closed the connection before full headers were sent. Please check back later.

Use Fetch as Google to check if Googlebot can currently crawl your site. If Fetch as Google returns the content of your homepage without problems, you can assume that Google is generally able to access your site properly.

It's possible that your server is overloaded or misconfigured. If the problem persists, check with your hosting provider.

Connection reset

Your server successfully processed Google's request, but isn't returning any content because the connection with the server was reset. Please check back later.

Use Fetch as Google to check if Googlebot can currently crawl your site. If Fetch as Google returns the content of your homepage without problems, you can assume that Google is generally able to access your site properly.

It's possible that your server is overloaded or misconfigured. If the problem persists, check with your hosting provider.

Truncated response

Your server closed the connection before we could receive a full response, and the body of the response appears to be truncated.

Use Fetch as Google to check if Googlebot can currently crawl your site. If Fetch as Google returns the content of your homepage without problems, you can assume that Google is generally able to access your site properly.

It's possible that your server is overloaded or misconfigured. If the problem persists, check with your hosting provider.

Connection refused

Google couldn't access your site because your server refused the connection. Your hosting provider may be blocking Googlebot, or there may be a problem with the configuration of their firewall.

Use Fetch as Google to check if Googlebot can currently crawl your site. If Fetch as Google returns the content of your homepage without problems, you can assume that Google is generally able to access your site properly.

It's possible that your server is overloaded or misconfigured. If the problem persists, check with your hosting provider.

Connect failed

Google wasn't able to connect to your server because the network is unreachable or down.

It's possible that your server is overloaded or misconfigured. If the problem persists, check with your hosting provider.

Use Fetch as Google to check if Googlebot can currently crawl your site. If Fetch as Google returns the content of your homepage without problems, you can assume that Google is generally able to access your site properly.

Connect timeout

Google was unable to connect to your server.

Use Fetch as Google to check if Googlebot can currently crawl your site. If Fetch as Google returns the content of your homepage without problems, you can assume that Googlebot is generally able to access your site properly.

Check that your server is connected to the Internet. It's also possible that your server is overloaded or misconfigured. If the problem persists, check with your hosting provider.

No response

Google was able to connect to your server, but the connection was closed before the server sent any data.

Use Fetch as Google to check if Googlebot can currently crawl your site. If Fetch as Google returns the content of your homepage without problems, you can assume that Googlebot is generally able to access your site properly.

It’s possible that your server is overloaded or misconfigured. If the problem persists, check with your hosting provider.

What is a robots failure?

This error means that Googlebot couldn't retrieve your site's robots.txt file. Before Googlebot crawls your site, and roughly once a day after that, Googlebot retrieves your robots.txt file to see which pages it should not be crawling. If your robots.txt file exists but is unreachable (in other words, if it doesn't return a 200 or 404 HTTP status code), we'll postpone our crawl rather than risk crawling URLs that you do not want crawled. When this happens, Googlebot will return to your site and crawl it as soon as we can successfully access your robots.txt file. More information about the robots exclusion protocol.

Fixing robots.txt file errors

You don't always need a robots.txt file.
You need a robots.txt file only if your site includes content that you don't want search engines to index. If you want search engines to index everything in your site, you don't need a robots.txt file—not even an empty one. If you don't have a robots.txt file, your server will return a 404 when Googlebot requests it, and we will continue to crawl your site. No problem.

Make sure your robots.txt file can be accessed by Google.
It's possible that your server returned a 5xx (unreachable) error when we tried to retrieve your robots.txt file. Check that your hosting provider is not blocking Googlebot. If you have a firewall, make sure that its configuration is not blocking Google.

URL errors overview

The URL errors section of the report is divided into categories that show the top 1,000 URL errors specific to that category. Not every error that you see in this section requires attention on your part, but it's important that you monitor this section for errors that can have a negative impact on your users and on Google crawlers. We've made this easier for you by ranking the most important issues at the top, based on factors such as the number of errors and pages that reference the URL. Specifically, you'll want to consider the following:

Fix Not Found errors for important URLs with 301 redirects. While it's normal to have Not Found (404) errors, you'll want to address errors for important pages linked to by other sites, older URLs you had in your sitemap and have since deleted, misspelled URLs for important pages, or URLs of popular pages that no longer exist on your site. This way, the information that you care about can be easily accessed by Google and your visitors.

Update your sitemaps. Prune old URLs from your sitemaps, and if you add newer sitemaps intended to replace older ones, be sure to delete the old sitemap (don't redirect it to the newer one).

Keep redirects clean and short. If you have a number of URLs that redirect in a sequence (e.g. pageA > pageB > pageC > pageD), it can be challenging for Googlebot to follow and interpret the sequence. Try to keep the "hops" to a low number. Read more about Not followed.
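Using the pageA > pageB > pageC > pageD example above, the chain can be collapsed so that every old URL points directly at the final destination. A sketch in Apache .htaccess syntax (assuming an Apache server; the page names are the hypothetical ones from the example):

```apache
# Instead of chaining pageA -> pageB -> pageC -> pageD,
# send each old URL straight to the final page in one hop.
Redirect 301 /pageA http://www.example.com/pageD
Redirect 301 /pageB http://www.example.com/pageD
Redirect 301 /pageC http://www.example.com/pageD
```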

Viewing URL error details

You can view URL errors in a variety of ways:

Click Download to retrieve a list of the top 1,000 errors for that crawler type (e.g. desktop, smartphone).

Use the filter above the table to locate specific URLs.

See error details by following the link from individual URLs or Application URIs.

Desktop or phone URL error details show status info on the error, a list of pages that reference the URL, and a link to Fetch as Google so you can troubleshoot problems with that URL.

Mark URL errors as fixed

Once you've addressed the issue causing an error for a specific item, you can hide it from the list, either singly or in bulk: select the checkbox next to the URL and click Mark as fixed, and the URL will be removed from the list. However, this marking is just a convenience for you; if Google's crawler encounters the same error on its next crawl of that URL, the error will reappear in the list.

Soft 404s

Usually, when a visitor requests a page on your site that doesn't exist, a web server returns a 404 (not found) error. This HTTP response code clearly tells both browsers and search engines that the page doesn't exist. As a result, the content of the page (if any) won't be crawled or indexed by search engines.

A soft 404 occurs when your server returns a real page for a URL that doesn't actually exist on your site. This usually happens when your server handles faulty or non-existent URLs as "OK," and redirects the user to a valid page like the home page or a "custom" 404 page.

This is a problem because search engines might spend much of their time crawling and indexing non-existent, often duplicative URLs on your site. This can negatively impact your site's crawl coverage because your real, unique URLs might not be discovered as quickly or visited as frequently due to the time Googlebot spends on non-existent pages.

If your page is truly gone and has no replacement, we recommend that you configure your server to always return either a 404 (Not found) or a 410 (Gone) response code in response to a request for a non-existing page. You can improve your visitors' experience by setting up a custom 404 page when returning a 404 response code. For example, you could create a page containing a list of your most popular pages, or a link to your home page, or a feedback link. But it's important to remember that it's not enough to just create a page that displays a 404 message. You also need to return the correct 404 or 410 HTTP response code.
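As a sketch, on an Apache server a custom 404 page can be wired up with one directive (the /custom-404.html path is an assumption); Apache still returns the 404 status code, which is the important part:

```apache
# Serve a friendly page for missing URLs while still returning HTTP 404.
# /custom-404.html is a hypothetical path to your custom error page.
ErrorDocument 404 /custom-404.html
```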

404

Googlebot requested a URL that doesn't exist on your site.

Fixing 404 errors

Most 404 errors don't affect your site's ranking in Google, so you can safely ignore them. Typically, they are caused by typos, site misconfigurations, or by Google's increased efforts to recognize and crawl links in embedded content such as JavaScript. Here are some pointers to help you investigate and fix 404 errors:

Decide if it's worth fixing. Many (most?) 404 errors are not worth fixing, because 404s don't harm your site's indexing or ranking. Sort your 404s by priority, fix the ones that need to be fixed, and ignore the rest. Here's why:

If it is a deleted page that has no replacement or equivalent, returning a 404 is the right thing to do.

If it is a bad URL generated by a script, or one that never existed on your site, it's probably not a problem you need to worry about. It might bother you to see it in your report, but you don't need to fix it, unless the URL is a commonly misspelled link (see below).

See where the invalid links live. Click a URL to see the Linked from these pages information. Your fix will depend on whether the link comes from your own site or from another site:

Fix links from your own site to missing pages, or delete them if appropriate.

If the content has moved, add a redirect.

If you have permanently deleted content without intending to replace it with newer, related content, let the old URL return a 404 or 410. Currently Google treats 410s (Gone) the same as 404s (Not found). Returning a code other than 404 or 410 for a non-existent page (or redirecting users to another page, such as the homepage, instead of returning a 404) can be problematic. Such pages are called soft 404s, and can be confusing to both users and search engines.

If the URL is unknown: You might occasionally see 404 errors for URLs that never existed on your site. These unexpected URLs might be generated by Googlebot trying to follow links found in JavaScript, Flash files, or other embedded content, or they may exist only in a sitemap. For example, your site may use code like this to track file downloads in Google Analytics:
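A hypothetical snippet of that kind, using the classic Google Analytics _gaq API (the file name and tracking path are placeholders):

```html
<!-- Hypothetical example: logging a PDF download as a virtual pageview.
     Googlebot may parse the string '/download-helloworld' as a URL. -->
<a href="helloworld.pdf"
   onclick="_gaq.push(['_trackPageview', '/download-helloworld']);">
   Hello World PDF</a>
```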

When Googlebot sees this code, it might try to crawl the URL http://www.example.com/download-helloworld, even though it's not a real page. In this case, the link may appear as a 404 (Not Found) error in the Crawl Errors report. Google is working to prevent this type of crawl error. This error has no effect on the crawling or ranking of your site.

Fix misspelled links from other sites with 301 redirects. For example, a misspelling of a legitimate URL (www.example.com/redshoos instead of www.example.com/redshoes) probably happened when someone linking to your site simply made a typo. In this case, you can capture that misspelled URL by creating a 301 redirect to the correct URL. You can also contact the webmaster of a site with an incorrect link, and ask for the link to be updated or removed.
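Using the example above, the misspelling fix is a one-liner in Apache .htaccess syntax (assuming an Apache server):

```apache
# Capture the common misspelling and send visitors (and Googlebot)
# to the correct page with a permanent redirect.
Redirect 301 /redshoos http://www.example.com/redshoes
```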

Ignore the rest of the errors. Don't create fake content, redirect to your homepage, or use robots.txt to block those URLs—all of these things make it harder for us to recognize your site’s structure and process it properly. We call these soft 404 errors. Note that clicking This issue is fixed in the Crawl Errors report only temporarily hides the 404 error; the error will reappear the next time Google tries to crawl that URL. (Once Google has successfully crawled a URL, it can try to crawl that URL forever. Issuing a 300-level redirect will delay the recrawl attempt, possibly for a very long time.)

Access denied

In general, Google discovers content by following links from one page to another. To crawl a page, Googlebot must be able to access it. If you're seeing unexpected Access Denied errors, it may be for the following reasons:

Googlebot couldn't access a URL on your site because your site requires users to log in to view all or some of your content.

Your robots.txt file is blocking Google from accessing your whole site or individual URLs or directories.

Your server requires users to authenticate using a proxy, or your hosting provider may be blocking Google from accessing your site.

To fix:

Test that your robots.txt is working as expected and does not block Google. The Test robots.txt tool lets you see exactly how Googlebot will interpret the contents of your robots.txt file. The Google user-agent is Googlebot.

Use Fetch as Google to understand exactly how your site appears to Googlebot. This can be very useful when troubleshooting problems with your site's content or discoverability in search results.
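The robots.txt check described above can also be sketched offline with Python's standard library; the file content below is a made-up example:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only.
robots_txt = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Googlebot may fetch public pages, but not anything under /private/.
print(parser.can_fetch("Googlebot", "http://www.example.com/page.html"))
print(parser.can_fetch("Googlebot", "http://www.example.com/private/x"))
```

This mirrors how a crawler interprets per-agent groups: the Googlebot group takes precedence over the wildcard group for Googlebot's requests.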

Not followed

The Not followed section lists URLs that Google could not completely follow, along with some information as to why. Here are some reasons why Googlebot may not have been able to follow URLs on your site:

Flash, JavaScript, active content

Some features such as JavaScript, cookies, session IDs, frames, DHTML, or Flash can make it difficult for search engines to crawl your site. Check the following:

Use a text browser such as Lynx to examine your site, since many search engines see your site much as Lynx would. If features such as JavaScript, cookies, session IDs, frames, DHTML, or Flash keep you from seeing all of your site in a text browser, then search engine spiders may have trouble crawling your site.

If you use dynamic pages (for instance, if your URL contains a ? character), be aware that not all search engine spiders crawl dynamic and static pages. In general, we recommend keeping parameters short and using them sparingly. If you're confident about how parameters work for your site, you can tell Google how we should handle them.

Redirects

If you are permanently redirecting from one page to another, make sure you're returning the right HTTP status code (301 Moved Permanently).

Where possible, use absolute rather than relative links. (For instance, when linking to another page in your site, link to www.example.com/mypage.html rather than simply mypage.html).

Try to make every page on your site reachable from at least one static text link. In general, minimize the number of redirects needed to follow a link from one page to another.

Check that your redirects point to the right pages! Sometimes we discover redirects that point to themselves (resulting in a loop error) or to invalid URLs.

Don't include redirected URLs in your Sitemaps.

Keep your URLs as short as possible. Make sure you aren't automatically appending information (such as session IDs) to your redirect URLs.

Make sure your site allows search bots to crawl your site without session IDs or arguments that track their path through the site.

DNS error

When you see this error for URLs, it means that Googlebot either couldn't communicate with the DNS server, or the DNS server had no entry for your site.

Faulty redirects

The Faulty redirect error appears in the URL Errors section of the Crawl > Crawl Errors page under the Smartphones tab.

Some websites use separate URLs to serve desktop and smartphone users and configure desktop pages to direct smartphone users to the mobile site (e.g. m.example.com). A faulty redirect occurs when a desktop page incorrectly redirects smartphone users to a smartphone page not relevant to their query. A typical example of this occurs when all desktop pages redirect smartphone users to the homepage of the smartphone-optimized site. In the figure below, the redirects shown with red arrows indicate faulty redirects:

This kind of redirect disrupts users' workflow and can cause them to stop using the site and look elsewhere.

Following are some tips to help you create a mobile-friendly search experience and avoid faulty redirects:

Use the example URLs provided in the report as a starting point to debug exactly where the problem is with your server configuration.

Set up your server so that it redirects smartphone users to the equivalent URL on your smartphone site.

If a page on your site doesn't have a smartphone equivalent, keep users on the desktop page, rather than redirecting them to the smartphone site's homepage. Doing nothing is better than doing something wrong in this case.

Finally, read our recommendations for having separate URLs for desktop and smartphone users.
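A sketch of the equivalent-URL redirect in Apache syntax (assuming an Apache server and a hypothetical m.example.com mobile site; the user-agent pattern is deliberately simplified):

```apache
RewriteEngine On
# Simplified, hypothetical smartphone detection; real patterns are more involved.
RewriteCond %{HTTP_USER_AGENT} "android.*mobile|iphone" [NC]
# Send smartphone users to the *equivalent* mobile URL, not the homepage.
RewriteRule ^(.*)$ http://m.example.com/$1 [R=302,L]
```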

URLs blocked for smartphones

The "Blocked" error appears on the Smartphone tab of the URL Errors section of the Crawl > Crawl Errors page. If you get the "Blocked" error for a URL on your site, that means that the URL is blocked for Google's smartphone Googlebot in your site's robots.txt file.

This may not necessarily be a smartphone-specific error (for example, the equivalent desktop pages may also be blocked). However, it often indicates that the robots.txt file needs to be modified to allow crawling of smartphone-enabled URLs. When the smartphone-enabled URLs are blocked, the mobile pages can't be crawled and because of this, they may not appear in search results.

If you get the "Blocked" smartphone crawl error for URLs on your site, examine your site's robots.txt file and make sure that you are not inadvertently blocking parts of your site from being crawled by Googlebot for smartphones.

Flash content

The Flash content error appears in the URL Errors section of the Crawl > Crawl Errors page under the Smartphones tab.

Our algorithms list URLs in this section as having content rendered mostly in Flash. Many devices cannot render these pages because Flash is not supported by iOS or Android versions 4.1 and higher.

We recommend that you improve the mobile experience for your website by using responsive web design for your site, a practice recommended by Google for building search-friendly sites for all devices. You can learn more about this in Web Fundamentals, a comprehensive resource for multi-device web development.
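At its simplest, responsive design starts with a viewport meta tag and CSS media queries; a minimal sketch (class names and breakpoint are illustrative):

```html
<!-- Tell browsers to scale the page to the device width. -->
<meta name="viewport" content="width=device-width, initial-scale=1">

<style>
  /* Hypothetical example: stack a two-column layout on narrow screens. */
  .column { width: 50%; float: left; }
  @media (max-width: 640px) {
    .column { width: 100%; float: none; }
  }
</style>
```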

Whichever approach you take to address this issue, be sure to allow Googlebot access to all assets of your site (CSS, JavaScript, and images) and do not block them with robots.txt or by other means. Our algorithms need these external files to detect your site's design configuration and treat it appropriately. You can make sure our indexing algorithms have access to your site by using the Fetch as Google feature in Search Console.

News crawl errors

Crawl errors are organized into categories, such as "Article extraction" or "Title error." Clicking one of these categories will display a list of affected URLs and the crawl errors they're generating.

Note: Please keep in mind that our news index is compiled by computer algorithms. While we strive to include as much of your content as possible, we can't guarantee the inclusion of every single article. We appreciate your understanding.

Article disproportionately short

The article body that we extracted from the HTML page is too small when compared to other clusters of text without links on the page. This applies to most pages that contain news briefs or multimedia content, rather than full news articles. We generated this error to avoid including what might be an incorrect piece of text.

Recommendations

This problem is often caused by:

Too many snippets for related articles - to help our extractor please consider making these snippets clickable.

Features such as 'Send this article to friends' with long descriptions - consider setting a "display:none" or "visibility:hidden" style to make the text invisible, or writing those pieces of HTML dynamically with JavaScript.

User comments - consider enclosing the comments in an iframe, dynamically fetching them with AJAX or moving them to an adjacent page.
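For example, the iframe option might look like this (the /comments URL is hypothetical):

```html
<article>
  <h1>Article headline</h1>
  <p>Article body text...</p>
</article>

<!-- Comments load in a separate document, so they aren't
     mixed into the extracted article body. -->
<iframe src="/comments?article=123" title="User comments"></iframe>
```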

Article fragmented

The article body that we extracted from the HTML page appears to consist of isolated sentences not grouped together into paragraphs. We generated this error to avoid including what might be an incorrect piece of text.

Recommendations

Check that your paragraphs are formatted such that each is more than one sentence in length.

Make sure your sentences are well punctuated.

Make sure you don't use frequent <br> and <p> tags within your paragraphs, and try to avoid breaking up the article body in general.

Consider removing some of the non-article text from the article page.

Article too long

The article body that we extracted from the HTML page appears to be too long to be a news article. We generated this error to avoid including what might be an incorrect piece of text. Common causes include news articles that contain user-contributed comments below the article, or HTML layouts that contain other material besides the news article itself.

Recommendations

Consider removing some of the non-article text from the article page. If the article page contains user comments, consider one of the following options:

enclosing them in an iframe.

dynamically fetching them with AJAX.

moving part of the comments to an adjacent page.

Article too short

The article body that we extracted from the HTML page appears to contain too few words to be a news article. This applies to most pages that contain news briefs or multimedia content, rather than full news articles. We generated this error to avoid including what might be an incorrect piece of text.

Recommendations

Try formatting your articles into text paragraphs of a few sentences each. If the article content appears to contain too few words to be a news article, we won't be able to include it.

Make sure your articles have more than 80 words.
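As a rough pre-publication sanity check against the 80-word guideline above, something like this sketch could be run over article bodies (the function and threshold are illustrative, not part of any Google tool):

```python
def looks_long_enough(article_body: str, minimum_words: int = 80) -> bool:
    """Rough check against the 80-word guideline described above."""
    return len(article_body.split()) >= minimum_words

print(looks_long_enough("word " * 100))  # True: 100 words
print(looks_long_enough("too short"))    # False: 2 words
```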

Date not found

We were unable to determine the publication date of the article.

Recommendations

Follow the date formatting recommendations below:

Place a clear date and time for each of your articles in between the article's title and the article's text in a separate line of HTML. The date should specify when the article was first published.

Remove any other dates from the HTML of the article page so that the crawler doesn't mistake them for the correct publication time.

If you'd like to use a date metatag, please contact us first. Date meta tags should be of the form: <meta name="DC.date.issued" content="YYYY-MM-DD">, where the date is in W3C format, using either the "complete date" (YYYY-MM-DD) format, or the "complete date plus hours, minutes and seconds" (YYYY-MM-DDThh:mm:ssTZD) format with a time zone suffix.

Create a News Sitemap. The <publication_date> tag will ensure we're able to pick the correct date for your articles.
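A News Sitemap entry with the publication date tag looks roughly like this (the URL, publication name, and title are placeholders; consult the News Sitemap documentation for the full required tag set):

```xml
<url>
  <loc>http://www.example.com/business/article55.html</loc>
  <news:news>
    <news:publication>
      <news:name>Example Times</news:name>
      <news:language>en</news:language>
    </news:publication>
    <!-- The publication date Google News should use for this article. -->
    <news:publication_date>2013-12-23</news:publication_date>
    <news:title>Companies A and B in Merger Talks</news:title>
  </news:news>
</url>
```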

Date too old

The date that we determined for this article, either from a <publication_date> tag in the Sitemap, or from a date in the page HTML itself, is too old.

Recommendations

Make sure your article is less than 2 days old. Currently we are only collecting articles that are 2 days old or less.

Follow the date formatting recommendations above.

Empty article

The article body that we extracted from the HTML page appears to be empty.

Recommendations

Make sure that the full text of each of your articles is available in the source code of your article pages (and not embedded in a JavaScript file or iframe, for example).

Make sure that you're not using a style in the source code of your articles such as "display:none" or "visibility:hidden".

Make sure the links to your articles lead directly to your article pages rather than to an intermediate page that uses a JavaScript redirect.

Extraction failed

We were unable to extract the article from the page. Extractions fail when we are unable to identify a valid title, body, and timestamp for the article. We list URLs with this error to provide you with information regarding why some articles may not appear in Google News.

Recommendations

Make sure that your title, body, and timestamp are easily crawlable (available as text and not as images, for instance). At this time, this error is primarily for informational purposes; we are actively working to improve our extraction methods so that you'll see this error less often.

No sentences found

The article body that we extracted from the HTML page appears not to contain punctuated sequences of contiguous words. We generated this error to avoid including what might be an incorrect section of text.

Recommendations

If the article content doesn't have punctuated sequences of contiguous words, we won't be able to include it in Google News. Make sure that the text of your articles is made up of sentences, and that you don't use frequent <br> or <p> tags within your paragraphs.

Make sure that the full text of each of your articles is available in the source code of your article pages (and not embedded in a JavaScript file, for example).

Make sure the links to your articles lead directly to your article pages rather than to an intermediate page that uses a JavaScript redirect.

Off-site redirect

The section or article page redirects to a URL on a different domain.

Recommendations

All section pages and articles must be located within the domain of the site included in Google News.

If you are not using off-site redirects, please make sure your site has not been modified by a third party. Read more about hacked sites.

Page too large

The section or article page length exceeds the maximum allowed.

Recommendation

The HTML source page can be up to 256KB in size.
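A quick way to sanity-check a page against the 256KB limit, sketched in Python (the helper function is illustrative):

```python
MAX_PAGE_BYTES = 256 * 1024  # the 256KB limit described above

def page_within_limit(html: str) -> bool:
    """Return True if the HTML source fits within the size limit."""
    return len(html.encode("utf-8")) <= MAX_PAGE_BYTES

print(page_within_limit("<html>" + "a" * 1000 + "</html>"))  # True
```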

Title not allowed

The title that we extracted from the HTML page suggests that it is not a news article.

Recommendation

Often this problem can be fixed by setting the <title> tag on the HTML page to the title of the article, and repeating the title in a prominent place on the HTML page, such as in an <h1> tag. Read more about titles.