Google's Algorithm... Why Google is Failing.

Google has initiated several questionable algorithm changes over the last few years: the sandbox, domain aging, link aging, duplicate content, etc. These changes have frustrated many, but amid the commotion over what they have done to our wallets, we have rarely considered whether they were in the best interests of searchers. I would like to comment on some of these. Please read the whole thing before you respond; there is a lot to be said.

(1) Duplicate Content

This is perhaps the greatest deceit committed by Google over the last few years. They have convinced us all that duplicate content is not valuable (duplicated across multiple sites or on the same site). This is simply not the case...

-> Duplicate content between multiple sites is actually evidence of value. If an article is replicated across several web sites, that is evidence of its value to the community. Google's intention is to show only one of those results - which is a huge deceit committed against the searcher. He or she is led to believe that the article listed at #3 stands alone, even though it has been reprinted by 50 other publications. Duplicate content is representative of a beautiful, hand-audited network of quality content. The very essence of Google was the concept of links being like book citations. Book citations include duplicate content: they copy valuable content and cite the source. This is exactly what search engines NEED to be useful!

-> Duplicate content on the same site is far more often a matter of providing value to the user than of trying to dupe the search engines. A user-friendly ecommerce web site should present a terms of service, privacy statement, refund description, contact information, etc. on every product ordering page. Users should not have to click off to separate pages of the site to hunt down the policy on each of these issues. But because these policies are uniform, this kind of information overshadows the unique content on a page and often trips a duplicate content filter, causing the product pages to go supplemental.

-> Duplicate content is often a side effect of a comprehensive site. A site with 25 sections included in a navigation bar, parts duplicated across the top and bottom for usability, with breadcrumbs, is cannon fodder for a duplicate content filter. Webmasters are reduced to hiding these sections in JavaScript, Flash, or image maps to avoid the penalty, but in the end do great damage to usability, especially for the disabled.

-> Unique content is often totally valueless. Take for example a site that sells car parts. They have windshield wipers specific to every make and model of car released in the United States over the last 10 years. How many different ways does Google expect a site to say "These are the windshield wipers for a 1990 Dodge Caravan"? What ends up occurring is these pages get duplicate content penalties and, instead, sites that offer a smaller number of products end up ranking higher. Thus, users get placed onto sites where they are less likely to find other products that might fit their vehicle.
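For what it's worth, duplicate-content filters of the kind described here are usually assumed to work by comparing overlapping word sequences ("shingles") between pages. A minimal Python sketch of that idea (the shingle size and the example pages are mine, not anything Google has published) shows how shared policy boilerplate can swamp a short unique product line:

```python
def shingles(text, k=4):
    """Break text into the set of overlapping k-word 'shingles'."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def similarity(a, b, k=4):
    """Jaccard similarity of two pages' shingle sets (0.0 to 1.0)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

# Two product pages share the policy boilerplate and differ only in one
# short product line, so the shared text dominates the similarity score.
boilerplate = ("You may return this item within 30 days for a full refund. "
               "Your email address will not be shared with any third party. ") * 3
page1 = "These are the windshield wipers for a 1990 Dodge Caravan. " + boilerplate
page2 = "These are the windshield wipers for a 1992 Toyota Camry. " + boilerplate
score = similarity(page1, page2)  # well above a typical near-duplicate cutoff
```

Under a scheme like this, two perfectly legitimate product pages come out looking nearly identical, which is exactly the complaint being made above.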

The true problem with duplicate content is when it is combined with other more nefarious search engine strategies, such as cloaking or doorway pages. Duplicate (on-site) content that stands alone is far less likely to be nefarious or created with the intent of duping the search engines.

Once again, what is the result of product pages not ranking because of duplicate content? Buy your way into Google Adwords.

(2) Domain Aging

A general trend noted across the board has been the value of old domains. Across every keyword field that my company has tested over the last 3 years, the relationship between allinanchor, allintext, and allintitle to SERPs has decreased while the relationship between Domain Age and SERPs has dramatically increased.

Google should know that age is rarely a good determinant of value. Its founders were fresh out of Stanford when they began. If we applied the same filter to them, Sergey would still be working the college computer labs to pay off his educational loans. This has been greatly detrimental in fields which are research- and innovation-based, such as the entire technology industry and the health industry. Because age is such a powerful factor, sites which are relaying new, useful information do not rank for quite some time. The information that is easily accessible tends to be archaic and generic.

This is particularly noticeable among niche, secondary or tertiary non-competitive keywords where only a handful of quality results exist. The sites that tend to retain the top positions are product pages from very old sites (the first to list on the web). A former client created a site from scratch in 2004 which contained over 500 articles specific to a niche topic. Each article was unique, written by doctors, and absolutely useful. The site gained a large number of quality inbound links naturally. The site's on-site SEO was impeccable. Subsequently, the site ranked in the top 10 for allinanchor, allintext, and allintitle. It is now 2006. This site continues to rank #284 for the niche keyword in Google. #1 in Yahoo. #1 in MSN. For the top 100 keywords that my clients are targeting, you will not find a single website created after 2003 in the top 10 - regardless of allinanchor, allintitle, and allintext rankings.

The result is just what Google wants: the client faithfully pays Google $1.13 per click for this term.

I like to think of search engines as mega democracies. Each web page / site is a voter. They can vote as many times as they like, but the more they vote, the less each vote counts (number of outbound links). If they know the candidate personally (same IP, same website, same Class C), their vote counts less. If they are experts on an issue, their vote counts more (theming). Domain and link aging, though? They are incumbency, the bane of democratic existence. They are the popularity contest that kept the smart transfer student from ever becoming Student Body President. They are the George W. Bushes who get nominated because upwards of 10% of voters think he is his father. The result: everyone has to pay the marketer to get their name out to fight the incumbent - and the marketer is AdWords.
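The vote-dilution idea in this analogy is essentially how the original PageRank scheme works: each page's score is split evenly across its outbound links, so the more votes a page casts, the less each one counts. A toy version, purely for illustration (the link graph is made up):

```python
def pagerank(links, damping=0.85, iterations=50):
    """Tiny PageRank: every page splits its vote evenly across its
    outbound links, so casting more votes makes each one count less.
    `links` maps each page to the list of pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new = {p: (1 - damping) / len(pages) for p in pages}
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new[t] += share
        rank = new
    return rank

# A casts a single focused vote for B; C spreads its vote three ways,
# so D, endorsed only by one third of C's vote, ends up ranked lowest.
web = {"A": ["B"], "B": ["C"], "C": ["A", "B", "D"], "D": ["A"]}
ranks = pagerank(web)
```

Nothing in this scheme needs a site's age: the dilution, same-neighborhood discounting, and theming can all be expressed as adjustments to the link weights themselves.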

(3) The Sand Box

While there are and always will be questions regarding what exactly the sandbox is, how long it lasts, and how to avoid it, I believe that there are some serious value questions which Google must answer in justifying its usage. For our purposes here, the sandbox is generally considered to be a method of preventing new sites from acquiring links abnormally fast and moving to the top of the search engines. Ostensibly, this is simply to prevent sites from using aggressive SEO tactics to launch new sites to the top of the SERPs.

I use the word "ostensibly" because the sandbox simply does not accomplish anything that could not be more accurately accomplished using traditional link verification methods. Filtering out link spamming techniques and using percentages of outbound links relative to inbound links are simple techniques which would counteract the majority of these attempts to boost rankings quickly. Improving these methods would exclude illegitimately acquired links.
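As a rough illustration, the outbound-to-inbound ratio check suggested here might look something like the following. The thresholds are invented for the example; whatever filters Google actually runs are unknown:

```python
def looks_like_link_farm(inbound, outbound, max_ratio=5.0, min_inbound=10):
    """Hypothetical filter: flag pages whose outbound links vastly
    outnumber their inbound links, a common trait of link farms and
    reciprocal-link directories. The thresholds are invented for the
    example, not anything Google has published."""
    if inbound < min_inbound:
        # barely-cited pages are flagged only if they link out heavily
        return outbound > min_inbound * max_ratio
    return outbound / inbound > max_ratio

# A directory page with 400 outbound links and 8 inbound links is
# flagged; a well-cited article linking out modestly is not.
farm = looks_like_link_farm(inbound=8, outbound=400)
article = looks_like_link_farm(inbound=200, outbound=30)
```

The point is that a check like this targets the suspicious *pattern* directly, rather than punishing every new site indiscriminately the way a blanket sandbox does.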

Links-aging, often identified as the major culprit behind the sandbox, is simply not an accurate method of determining the value of a link. On the contrary, if the other techniques mentioned above are applied, the only links remaining to be questioned by a links-aging filter are those legitimately acquired. Thus, a links-aging filter tends to harm only sites with a large number of legitimately acquired links, such as phenomenon sites. This has been documented in several cases (the Christopher Walken 2008 site, for example) and has driven users on many occasions to use Yahoo and MSN to find phenomenon web sites.

What the "sandbox" and "links-aging" filters have accomplished, though, is encouraging new web site owners to use AdWords to promote their site through Google's search engine. Unlike the natural search results, deep enough pockets can secure a top ranking for any keyword in 1 day.

Conclusions

Generally speaking, an accurate linking-scheme filter is all that is needed to make a good search engine algorithm. The beauty of Google's original vision is that it turned the entire web into a giant Democracy - a hand audited system of determining the value of a website. All Google has to do is remove voter fraud.

The filters mentioned above are crude shortcuts that have improved quality marginally at best while excluding a large amount of quality content and ignoring a large number of new quality web sites. Their impact on AdWords revenue (just looking at my own client's participation) is far greater than their improvement of search relevancy and importance.

Google needs to focus on its original methods. A good link-scheme filter would encourage SEOs to do the only thing they can to get good, natural back links - write good content with open-republication policies (Yes! Encouraging other webmasters to duplicate the content with links back to the original - think quotes in an academic publication! This is good Google! Not Bad!). This is where SEOs are valuable to Google, to the Search User, and to the Web Site Owner.

Comments on this post

googler agrees
: Good of you to take the time to make a good discussion

raz agrees
: Nice Analysis

brandall agrees
: I actually disagree with most of your points, but I respect the time and thought you put into the post. I'm sure it will spark a good conversation, and that is worth some rep.

newbieuk23 agrees
: I too am sick of Google's continual crusade to keep all but authority sites out of the SERPS

normanlister agrees
: Bit of a mixed bag in terms of insight (disagree with a lot), but as a stimulus for debate this is a superb effort!

Google is a moving target. The website is their property and they can do with it as they please. IMO *what* they are doing or *how* or *why* is not the issue because we have zero control over it. It's their finances and their ethics - not ours.

We do have control over our own sites and that is where we should focus our energies.

If you have a site that is a real success then most of your traffic will come from bookmarks, type ins and links from related sites. Build your biz to be independent of search.

Comments on this post

googler agrees

rjonesx disagrees
: Why do you even hang out in this forum? It's about SEO. Your answer is like telling McDonald's that if their food is good enough, they won't need to advertise.

brandall agrees
: rjonesx, I think it is you who missed the point. Sites that don't need search also tend to rank very well. It is very Zen, really.

internex agrees
: Well said, brandall - SEO should be part of a comprehensive online marketing plan, not the only component

crxvfr agrees
: McDonald's? IMO EGOL has always had well-grounded, objective, mature and original posts. He thinks for himself. Look at his rep.

jasondmce agrees
: McDonald's sucks, which is why they spend so much money on advertising. Yet some of the best restaurants in the world don't spend a penny on advertising. They get it free through word of mouth. Based on this analogy, EGOL makes a very good point!

Last edited by EGOL; Mar 8th, 2006 at 03:17 PM.

* "It's not the size of the dog in the fight that matters, it's the size of the fight in the dog." Mark Twain
* "Free advice isn't worth much. Cheap advice is worth even less." EGOL

Maybe I'm in the minority, but all of the businesses I support have seen continued improvement in their Google SERPs for the last two years. None of them are spending a dime on AdWords either.

Not one of them has seen a decline in Google referrals. All show increases, and some show very significant increases. So from my perspective, I love what Google is doing. All of the sites I support focus on good unique content too, so the people visiting our sites are getting the relevant search results they are looking for. So all parties are happy: the SEO firm, the clients, and the web surfers.

I'm sure my opinion would be different if I was not experiencing the above mentioned results.

I also wouldn't describe a 5% increase in search traffic market share in 2005 (according to Nielsen.net) as a failure from Google's perspective.

Last edited by europa; Mar 8th, 2006 at 03:37 PM.
Reason: Add

"It is better to confess ignorance than provide it" - Homer Hickam

Organic Lead Generation Specialist and other services by D. Clark Associates

Originally Posted by rjonesx

Google has initiated several questionable algorithm changes over the last few years: the sandbox, domain aging, link aging, duplicate content, etc. These changes have frustrated many, but amid the commotion over what they have done to our wallets, we have rarely considered whether they were in the best interests of searchers. I would like to comment on some of these. Please read the whole thing before you respond; there is a lot to be said.

great discussion

Originally Posted by rjonesx

(1) Duplicate Content

This is perhaps the greatest deceit committed by Google over the last few years. They have convinced us all that duplicate content is not valuable (duplicated across multiple sites or on the same site). This is simply not the case...

-> Duplicate content between multiple sites is actually evidence of value. If an article is replicated across several web sites, that is evidence of its value to the community. Google's intention is to show only one of those results - which is a huge deceit committed against the searcher. He or she is led to believe that the article listed at #3 stands alone, even though it has been reprinted by 50 other publications. Duplicate content is representative of a beautiful, hand-audited network of quality content. The very essence of Google was the concept of links being like book citations. Book citations include duplicate content: they copy valuable content and cite the source. This is exactly what search engines NEED to be useful!

Why should Google rank more than one instance of a source? The searcher is looking for a specific answer to his/her query. I want to find multiple sources and different views. I want to find more than one store when I look for a product. Google is trying to encourage linking to the content rather than just copying and pasting the content into your webpage.

Originally Posted by rjonesx

-> Duplicate content on the same site is far more often a matter of providing value to the user than of trying to dupe the search engines. A user-friendly ecommerce web site should present a terms of service, privacy statement, refund description, contact information, etc. on every product ordering page. Users should not have to click off to separate pages of the site to hunt down the policy on each of these issues. But because these policies are uniform, this kind of information overshadows the unique content on a page and often trips a duplicate content filter, causing the product pages to go supplemental.

I have never experienced this problem. I have global nav bars and contact info on many sites and have never had a serious dupe content filter. I believe something else is at play when these types of sites get supplemental results. There is nothing wrong with providing a link to your TOS, refund policy and contact info on every page rather than repeating the same content globally. It makes more sense IMHO.

Originally Posted by rjonesx

-> Duplicate content is often a side effect of a comprehensive site. A site with 25 sections included in a navigation bar, parts duplicated across the top and bottom for usability, with breadcrumbs, is cannon fodder for a duplicate content filter. Webmasters are reduced to hiding these sections in JavaScript, Flash, or image maps to avoid the penalty, but in the end do great damage to usability, especially for the disabled.

Again, I believe that if the content in the body of your site is unique and extensive enough, the nav bars and navigation breadcrumbs will not be penalized. Footer links have long been ignored for the most part by Google.

Originally Posted by rjonesx

-> Unique content is often totally valueless. Take for example a site that sells car parts. They have windshield wipers specific to every make and model of car released in the United States over the last 10 years. How many different ways does Google expect a site to say "These are the windshield wipers for a 1990 Dodge Caravan"? What ends up occurring is these pages get duplicate content penalties and, instead, sites that offer a smaller number of products end up ranking higher. Thus, users get placed onto sites where they are less likely to find other products that might fit their vehicle.

I agree with this one. It makes more sense to have a site that is comprehensive.
Originally Posted by rjonesx

The true problem with duplicate content is when it is combined with other more nefarious search engine strategies, such as cloaking or doorway pages. Duplicate (on-site) content that stands alone is far less likely to be nefarious or created with the intent of duping the search engines.

Once again, what is the result of product pages not ranking because of duplicate content? Buy your way into Google Adwords.

True

Originally Posted by rjonesx

(2) Domain Aging

A general trend noted across the board has been the value of old domains. Across every keyword field that my company has tested over the last 3 years, the relationship between allinanchor, allintext, and allintitle to SERPs has decreased while the relationship between Domain Age and SERPs has dramatically increased.

Google should know that age is rarely a good determinant of value. Its founders were fresh out of Stanford when they began. If we applied the same filter to them, Sergey would still be working the college computer labs to pay off his educational loans. This has been greatly detrimental in fields which are research- and innovation-based, such as the entire technology industry and the health industry. Because age is such a powerful factor, sites which are relaying new, useful information do not rank for quite some time. The information that is easily accessible tends to be archaic and generic.

This is particularly noticeable among niche, secondary or tertiary non-competitive keywords where only a handful of quality results exist. The sites that tend to retain the top positions are product pages from very old sites (the first to list on the web). A former client created a site from scratch in 2004 which contained over 500 articles specific to a niche topic. Each article was unique, written by doctors, and absolutely useful. The site gained a large number of quality inbound links naturally. The site's on-site SEO was impeccable. Subsequently, the site ranked in the top 10 for allinanchor, allintext, and allintitle. It is now 2006. This site continues to rank #284 for the niche keyword in Google. #1 in Yahoo. #1 in MSN. For the top 100 keywords that my clients are targeting, you will not find a single website created after 2003 in the top 10 - regardless of allinanchor, allintitle, and allintext rankings.

The result is just what Google wants: the client faithfully pays Google $1.13 per click for this term.

I like to think of search engines as mega democracies. Each web page / site is a voter. They can vote as many times as they like, but the more they vote, the less each vote counts (number of outbound links). If they know the candidate personally (same IP, same website, same Class C), their vote counts less. If they are experts on an issue, their vote counts more (theming). Domain and link aging, though? They are incumbency, the bane of democratic existence. They are the popularity contest that kept the smart transfer student from ever becoming Student Body President. They are the George W. Bushes who get nominated because upwards of 10% of voters think he is his father. The result: everyone has to pay the marketer to get their name out to fight the incumbent - and the marketer is AdWords.

Well, I sure don't think that Joe Schmoe buying a hundred domains to do a DMOZ rip-off should be in the results. I also think that a serious website would be around for a few years. Why should Google rank every domain that pops up above established sites with solid backlinks and aged content? Those sites have been providing good results for years. If it ain't broke, don't fix it. Though I don't think domain age should play as heavily as it does; it should be one factor in determining the value of a site's content. Content aging also plays a part: even on established domains, brand-new content won't always rank right away. It needs to gain links and get some click-through.

Originally Posted by rjonesx

(3) The Sand Box

While there are and always will be questions regarding what exactly the sandbox is, how long it lasts, and how to avoid it, I believe that there are some serious value questions which Google must answer in justifying its usage. For our purposes here, the sandbox is generally considered to be a method of preventing new sites from acquiring links abnormally fast and moving to the top of the search engines. Ostensibly, this is simply to prevent sites from using aggressive SEO tactics to launch new sites to the top of the SERPs.

I use the word "ostensibly" because the sandbox simply does not accomplish anything that could not be more accurately accomplished using traditional link verification methods. Filtering out link spamming techniques and using percentages of outbound links relative to inbound links are simple techniques which would counteract the majority of these attempts to boost rankings quickly. Improving these methods would exclude illegitimately acquired links.

Links-aging, often identified as the major culprit behind the sandbox, is simply not an accurate method of determining the value of a link. On the contrary, if the other techniques mentioned above are applied, the only links remaining to be questioned by a links-aging filter are those legitimately acquired. Thus, a links-aging filter tends to harm only sites with a large number of legitimately acquired links, such as phenomenon sites. This has been documented in several cases (the Christopher Walken 2008 site, for example) and has driven users on many occasions to use Yahoo and MSN to find phenomenon web sites.

What the "sandbox" and "links-aging" filters have accomplished, though, is encouraging new web site owners to use AdWords to promote their site through Google's search engine. Unlike the natural search results, deep enough pockets can secure a top ranking for any keyword in 1 day.

Conclusions

Generally speaking, an accurate linking-scheme filter is all that is needed to make a good search engine algorithm. The beauty of Google's original vision is that it turned the entire web into a giant Democracy - a hand audited system of determining the value of a website. All Google has to do is remove voter fraud.

The filters mentioned above are crude shortcuts that have improved quality marginally at best while excluding a large amount of quality content and ignoring a large number of new quality web sites. Their impact on AdWords revenue (just looking at my own client's participation) is far greater than their improvement of search relevancy and importance.

Google needs to focus on its original methods. A good link-scheme filter would encourage SEOs to do the only thing they can to get good, natural back links - write good content with open-republication policies (Yes! Encouraging other webmasters to duplicate the content with links back to the original - think quotes in an academic publication! This is good Google! Not Bad!). This is where SEOs are valuable to Google, to the Search User, and to the Web Site Owner.

The sandbox is somewhat valuable: a link that has been around for a while should carry greater relevancy than one gained a few days ago. Links come and go, and link aging may simply be a method Google uses so it doesn't have to take so much into account when ranking a site. It would be tough on their system to rank every link the same way. My guess is that if they don't count links less than six months old, that takes out a good majority of links from their results. Crappy links come and go, but good links stick around.

PageRank plays into this: if you get a link from a high-PR site, the link doesn't face the same aging filter as a link from a low-PR site.

It sounds to me like you are pissed about having to pay for AdWords. Google needs to make money, and AdWords is how they do it. They can only rank 10 sites on the first page, so someone is going to have to buy AdWords to be seen there. When I look for car tires, I am most likely going to buy from established shops such as Firestone and Goodyear. This is where most people are when shopping online: they want a trustworthy website to do business with, and domain age, inbound link age, content age, and overall site size and establishment are going to prove that trust.

Maybe I'm in the minority, but all of the businesses I support have seen continued improvement in their Google SERPs for the last two years. None of them are spending a dime on AdWords either.

First, you are lucky. The majority of things that have struck sites - white-hat, black-hat, no SEO, whatever - have been unrelated to methods of improving search results. As I discuss above, the major algorithm changes initiated by Google have not accurately targeted SEO tactics. Not having a 301 redirect from non-www to www, listing your navigation, contact information, and privacy policy on all pages and tripping a duplicate content penalty, not having a site as old as your competitors: these are all issues that affect rankings regardless of SEO practice. You are lucky.
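For anyone hit by the first of those issues, the standard fix for the non-www/www split on an Apache server is a 301 in .htaccess, roughly like this (example.com is a placeholder; substitute your own domain):

```apache
# .htaccess (Apache mod_rewrite): 301-redirect the bare domain to www,
# so search engines see one canonical host for every URL.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

With that in place, link credit pointing at example.com and www.example.com is consolidated onto a single version of the site.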

I also wouldn't describe a 5% increase in search traffic market share in 2005 (according to Nielsen.net) as a failure from Google's perspective.

The search industry is too young to be adequately measured by Nielsen et al. A large number of searchers are newcomers and are not yet savvy enough, or dependent enough, to find value in choosing between one search engine and another. As people become more dependent upon search in their daily activities - looking for advice, products, jobs, etc. - they will begin to choose between competitors. Additionally, the inclusion of Google in Firefox as it spread to 10% market share was a huge boost for Google usage. Simplicity, not quality.

Why should Google rank more than one instance of a source? The searcher is looking for a specific answer to his/her query. I want to find multiple sources and different views. I want to find more than one store when I look for a product. Google is trying to encourage linking to the content rather than just copying and pasting the content into your webpage.

I agree that it is important to find multiple views when searching, but knowing that MANY sources agree with this one source is valuable in itself. Perhaps a compromise is best: offering links under the listing (like you see for many top results these days) showing a subset of other sites echoing the article.

Justification of Aging Sites, etc...

(1) I am not pissed about AdWords. I do believe, however, that AdWords is driving their search algorithm these days.

(2) Link age does not matter if you have filtered out illegitimately acquired links.

(3) Aged content is actually BAD in many industries. It is not up to date, and sometimes no longer accurate. How many people go through the outbound links on their site to check whether those sites have kept their content relevant to today's knowledge? Aging content and links is like continuing to give more and more value to a political candidate because he has the same support he had last year. Aged sites should be held to the same standard: if they are not continuing to acquire good new backlinks, it is a sign that they should be ditched.

I still stand by my general, overall premise.

A good link-scheme filter is all that is needed to make a good search algorithm. If you devalue based on recips, 3-ways, same Class C, same IP, same WHOIS owner, same host, forums, comments, guestbooks, wiki spam, etc., you do not need an age filter.
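A sketch of what such a devaluation pass might look like for two of those signals, reciprocal links and shared Class C blocks. The halving factors, site names, and IPs are all invented for illustration:

```python
def same_class_c(ip_a, ip_b):
    """True if two IPv4 addresses share a Class C block (first three
    octets), e.g. 192.0.2.10 and 192.0.2.99."""
    return ip_a.split(".")[:3] == ip_b.split(".")[:3]

def link_weight(source, target, all_links):
    """Hypothetical devaluation pass for one link from `source` to
    `target`. Each suspicious signal cuts the link's weight; the
    factors are invented for illustration, not Google's real numbers."""
    weight = 1.0
    # reciprocal link: the target links straight back to the source
    if (target["url"], source["url"]) in all_links:
        weight *= 0.5
    if source["ip"] == target["ip"]:
        weight *= 0.25   # same server: the "candidate" knows the voter
    elif same_class_c(source["ip"], target["ip"]):
        weight *= 0.5    # same Class C: probably the same network
    return weight

a = {"url": "a.example", "ip": "192.0.2.10"}
b = {"url": "b.example", "ip": "192.0.2.99"}
links = {("a.example", "b.example"), ("b.example", "a.example")}  # reciprocal
w = link_weight(a, b, links)  # 0.5 (reciprocal) * 0.5 (Class C) = 0.25
```

The weights produced here would then feed the ranking calculation directly, with no need to ask how old the link is.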

Usability and Duplicate Content

The issue was raised about linking to pages with the privacy policy, return policy, etc. instead of having them on the product page itself. Multiple usability studies have shown (I would know; my wife is a Web Usability Consultant and I hear it all the time) that placing that information directly in front of the user when making a purchase is far more effective. Even putting it in a textarea (like a EULA) is not effective. Simply listing on every product page "You may return this item within 30 days if not satisfied for a full refund. After 30 days, the item can be returned for store credit" and "Your email address and contact information will not be shared with any third party. If you believe it has been, please contact..." right before the "Check Out" button is highly effective at educating your users.

Unfortunately, these kinds of additions to a page are also highly effective at attracting a duplicate content penalty.

I agree that it is important to find multiple views when searching, but knowing that MANY sources agree with this one source is valuable in itself. Perhaps a compromise is best: offering links under the listing (like you see for many top results these days) showing a subset of other sites echoing the article.

I like that idea: like the indented listings, but with different sites, and a link that says "for more results like this one, click here" or something like that.

"The search industry is too young to be adequately measured by Nielsen et al. A large number of searchers are newcomers and are not yet savvy enough, or dependent enough, to find value in choosing between one search engine and another. As people become more dependent upon search in their daily activities - looking for advice, products, jobs, etc. - they will begin to choose between competitors. Additionally, the inclusion of Google in Firefox as it spread to 10% market share was a huge boost for Google usage. Simplicity, not quality."

I'd say the numbers do not agree with you, with Internet searches in the U.S. "increasing by 55% year to year and the growth of Internet connections being only 3%"

http://www.nielsen-netratings.com/pr/pr_060209.pdf

according to Nielsen again. I'd say, sophisticated or not, most web surfers are choosing Google, regardless of what we think of the methods for delivering results. The U.S. is a mass-market country. People shop at Walmart and Best Buy, listen to Britney Spears, and watch American Idol. They are the masses, and Google serves them well.

rjonesx disagrees: Why do you even hang out in this forum? It's about SEO. Your answer is like telling McDonald's that if their food is good enough, they won't need to advertise.

Minority opinion is in many instances the position of advantage... if you can build what I described you will have no trouble in the SERPs - you will own them.

Yes, especially in business: go with the out-of-the-box approach; sometimes the simplest approaches provide the maximum benefit. This is true in SEO. There is no shortcut, no magic bullet, no top-secret method. The methods I learned four years ago are still many of the same methods I use today, with some of my own spin on them. Granted, I keep up with the changes and modify my approach slightly, but I still keep the same fundamentals. EGOL is right: if you take the time to build an authority site, provide good content, and make your site worthwhile, then you will get more traffic from direct links etc. than from search engines. However, I have not achieved this, and I still need search engine rankings.