Tiny Transactions on Computer Science

Spammage


Hide from cache

If you don’t want web searchers to be able to access a cached version of your page, use the noarchive meta tag like this:

<meta name="robots" content="noarchive">

The page will still be crawled and indexed by Google, but users will not see a cached link in search results.
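To confirm the tag actually made it into a page, something like the following could be used. It is only a minimal sketch using Python’s standard-library HTML parser; the names `RobotsMetaParser` and `has_noarchive` are made up for this illustration, and it only inspects HTML you pass in as a string (it does not fetch anything or look at HTTP headers).

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser already lowercases tag and attribute names.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noarchive(html_text):
    """Return True if the HTML contains a robots meta tag with noarchive."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return "noarchive" in parser.directives


page = '<html><head><meta name="robots" content="noarchive"></head></html>'
print(has_noarchive(page))  # True
```

The check is case-insensitive and copes with several directives in one tag, for example `content="noindex, noarchive"`.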

Similar to your website

The related: operator displays websites similar to the site you are looking for. It returns the same results as clicking Similar pages next to a result on the search results page.

I was curious about the results returned by Similar pages, since its intent is to return overlapping resources. Specifically, I wondered whether appearing alongside certain sites could be detrimental for search engine optimization. According to Google, there’s no need for concern, at least for the moment:

The quality of the sites returned has no impact on your ranking or on how Google indexes your site.

In October 2010, the special searches help section in Google Webmaster Tools was updated for the first time in several years. Special search results give insight into how Google indexes your site.


Special site searches

The info: operator returns the full list of special search queries for a site. For example,

info:google.com

Search indexed pages

View all pages indexed by Google for your site using the site: operator.

Entering

site:google.com

returns all indexed pages for google.com.

Note: Don’t use a space between the operator and the URL!

Google search results for domain Wikipedia.com and eight sub-domains

Search within a single domain or sub-domain

The same syntax is used whether searching an entire domain, or restricting the search to a sub-domain only.

The same syntax is also used to restrict search results to a specific sub-directory.

The command to search only within the webmasters sub-directory of site google.com is

site:google.com/webmasters
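If you build these queries in a script, the no-space rule is easy to enforce. This is a rough sketch only: `site_query` and `search_url` are hypothetical helper names, and the `https://www.google.com/search?q=...` URL shape is my assumption rather than anything stated in Google’s help pages.

```python
from urllib.parse import urlencode


def site_query(target):
    """Build a site: query for a domain, sub-domain, or sub-directory."""
    target = target.strip()
    # No space is allowed between the operator and the URL,
    # so whitespace inside the target is rejected as well.
    if not target or any(ch.isspace() for ch in target):
        raise ValueError("target must be non-empty and contain no whitespace")
    return "site:" + target


def search_url(query):
    """Turn a query string into a search URL (URL shape is an assumption)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = site_query("google.com/webmasters")
print(query)              # site:google.com/webmasters
print(search_url(query))  # https://www.google.com/search?q=site%3Agoogle.com%2Fwebmasters
```

The same helper works unchanged for a whole domain, a sub-domain, or a sub-directory, which mirrors the point above that the syntax is identical in all three cases.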

Exclude pages

To exclude particular pages from search, use a minus sign before the operator.

This would be the command to return results for all indexed pages on the google.com domain, without any adwords.google.com pages:

site:google.com -site:adwords.google.com
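For anyone generating such queries programmatically, the exclusion rule above (a minus sign directly before the operator) can be sketched as follows. The helper names `exclude` and `site_without` are invented for this example and are not part of any Google tool.

```python
def exclude(term):
    """Negate a search term or operator expression with a leading minus sign."""
    return "-" + term


def site_without(domain, excluded_hosts):
    """Build a site: query for domain that leaves out the given hosts."""
    parts = ["site:" + domain]
    # Each exclusion is its own operator expression, separated by a space.
    parts += [exclude("site:" + host) for host in excluded_hosts]
    return " ".join(parts)


print(site_without("google.com", ["adwords.google.com"]))
# site:google.com -site:adwords.google.com
```

Several exclusions can be chained by passing more than one host in the list.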

Pages that link to your site’s front page

Google advises using the first command syntax as it will return more complete results.

Links to pages

Search for all links to specific pages or sub-directories. This command will return all links to the webmasters sub-directory of the google.com domain:

link:google.com/webmasters

The current cache of your site

View Google’s archived copy of an indexed web page using the cache: operator.

This is sometimes called the cached version of the page. For example,

cache:google.com

displays the most recently cached version of the Google homepage, google.com, along with the date the cached copy was created. You may also view a plain-text version of the page, which is useful because it shows how Googlebot sees the page.

Pages that are similar to your site

The related: operator displays websites similar to the site you are looking for. It returns the same results as clicking Similar pages next to a result on the main Google Search results page. Google provides more detail:

This search is like searching a bookstore to find books similar to the first Harry Potter novel. The results could include other children’s books, a biography of J.K. Rowling, or a non-fiction book on children’s literature. In general, use this operator to find resources that overlap. You’ll get the best and most useful results if you use sites that cover a broad range of content.

Google uses several factors to determine how similar different sites are, but does not describe these factors any further, other than stating that

the quality of the sites returned has no impact on your ranking or on how Google indexes your site.