Tagged Questions

robots.txt is a text file that website owners use to give web robots instructions about their site. It tells robots which parts of the site are open to crawling and which parts are closed. This convention is called the Robots Exclusion Protocol.
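
As a rough illustration of how a well-behaved robot might read such a file, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules, the `ExampleBot` user agent, and the `example.com` URLs are all made up for the example and are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration: everything is open except /private/.
RULES = """\
User-agent: *
Disallow: /private/
"""


def build_parser(rules: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser


parser = build_parser(RULES)

# "ExampleBot" is an assumed user agent name; it falls under the "*" group.
open_ok = parser.can_fetch("ExampleBot", "https://example.com/index.html")
closed_ok = parser.can_fetch("ExampleBot", "https://example.com/private/page.html")

print("index.html allowed:", open_ok)
print("private page allowed:", closed_ok)
```

Note that this only shows what a compliant robot would decide. The file is a request, not an access control, so a robot that ignores it is not stopped by it.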

As I understand it, having repetitive content is detrimental to search engine placement.
Given that many websites use similar or even identical "Terms and Conditions" and "Privacy Policy" pages due ...

As you can all see from the picture below, my site's content is duplicated by FeedReader and indexed by Google. When I clicked the FeedReader link, I saw that it uses some sort of iframe to draw content from ...

Although I added an entry for Yandex to my robots.txt file, Yandex sometimes indexes my website aggressively. So I hard-coded a check on the user agent, and I serve a cached file if the user agent is like ...

I noticed recently that Google is not caching all of the pages on my website. Upon using the Google webmaster diagnostic tool, I realized that some of my pages were being restricted by entries in my ...

In the Google Webmaster Tools report, certain URL-encoded sequences such as %5c and %22 are showing up in our site's URLs.
We tried to identify the issue and observed that due to incorrect ...

Google clearly states that duplicate content within a single, or multiple, domains is not advised. This is understood, but I am not sure of any exceptions for sites with region-specific content that ...

I have a couple of questions and ideas related to robots.txt:
Can we deny access to the website for all bots except chosen ones, in order to
tell other bots not to crawl the site:
User-agent: *
Disallow: /
...
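
One way to sanity-check the "block everyone except chosen bots" idea from the excerpt above is to run a candidate file through Python's standard `urllib.robotparser`. This is only a sketch: the allow-listed name `Googlebot`, the `OtherBot` name, and the `example.com` URL are assumptions chosen for illustration, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one allow-listed bot, everyone else denied.
# An empty "Disallow:" value means nothing is disallowed for that group.
ALLOWLIST_RULES = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""


def is_allowed(rules: str, user_agent: str, url: str) -> bool:
    """Return True if the given user agent may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


URL = "https://example.com/some/page.html"  # placeholder URL

chosen_bot_allowed = is_allowed(ALLOWLIST_RULES, "Googlebot", URL)
other_bot_allowed = is_allowed(ALLOWLIST_RULES, "OtherBot", URL)

print("Googlebot allowed:", chosen_bot_allowed)
print("OtherBot allowed:", other_bot_allowed)
```

The blank line between the two groups matters here, since it is what separates the specific group from the catch-all `*` group.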

In my blog's Google Webmaster Tools panel, I found the following rules from my robots.txt in the blocked URLs section.
User-agent: Mediapartners-Google
Disallow: /search
Allow: /
I know that Disallow will ...

I've recently switched to MVC3, which uses extensionless URLs, but Google and Bing have a wealth of links that they are crawling which no longer exist.
So I'm trying to find out if there is a ...

We have a staging version of our website to test changes on at trailheadpaddleshack.ca/staging1. This never appeared in search before. Recently the staging site has appeared on Google and is affecting ...

Last week my site was utterly pummeled in Google's rankings - losing 95% of impressions overnight according to Google Webmaster Tools. It now only shows up if you search for the URL/site name itself.
...

This is the current robots.txt file I am using on a site of mine. I have excluded large parts of the IP.Board forum in order to try and cut down on duplicate content. I've also excluded some WordPress ...

Quick and simple question.
I have 80+ HTML files which I want to be crawled. They are individual product pages. Each of these pages calls its content using PHP includes. These PHP include files are ...

Is there an estimated amount of time before a document stops appearing in Google results?
We have a public records document. One person raised a concern when they saw their "PUBLIC INFORMATION" on ...

According to Matt Cutts, even if we block a page using robots.txt, it is better to use the noindex tag. (Source: https://www.youtube.com/watch?v=KBdEwpRQRD0)
If it is blocked by Robots.txt, how ...