This weekend, I read some of the research papers Gary Price posted about over at the SEW Blog.

Many of the papers that discuss link farms and link schemes also discuss ways to fight them.

They all touched on how link farms are easier to detect than the newer methods of link spam. In addition, they classified search engine spam into stages:

(1) 1st-generation spam was where ranking depended on document similarity, and keyword stuffing was the spam technique that worked.
(2) 2nd-generation spam was where ranking depended (largely) on "site popularity," and the spam method used was link farms.
(3) 3rd-generation spam is where ranking depends largely on "page reputation," and the spam technique used is "mutual admiration societies." This is the current stage we are in.

Many of the papers discussed methods of automatically finding and then penalizing the 3rd-generation spam technique. Some went as far as discussing "BadRank," where pages found within a linking network that fits the spam patterns above are downgraded.

Most of the papers discuss a certain threshold (i.e. the page needs to be associated with X or more "bad" sites) to be downgraded and marked as a bad site.
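The papers themselves don't share pseudocode, but the idea is roughly PageRank run in reverse: "badness" flows backwards along links, so a page that links out to known-bad pages inherits a share of their badness, and anything over a threshold gets flagged. Here's a minimal sketch of that idea; the graph, the seed set, the damping value, and the cutoff are all invented for illustration, not taken from any of the papers:

```python
# Hypothetical sketch of the "BadRank" idea: badness propagates backwards
# along links, so a page linking TO bad pages accumulates badness.

def badrank(outlinks, seed_bad, damping=0.85, iters=20):
    """outlinks: {page: [pages it links to]}; seed_bad: set of known spam pages."""
    pages = set(outlinks) | {q for targets in outlinks.values() for q in targets}
    inlink_count = {p: 0 for p in pages}
    for targets in outlinks.values():
        for q in targets:
            inlink_count[q] += 1
    scores = {p: (1.0 if p in seed_bad else 0.0) for p in pages}
    for _ in range(iters):
        nxt = {}
        for p in pages:
            seed = 1.0 if p in seed_bad else 0.0
            # Each page p splits in the badness of every page it links to.
            spread = sum(scores[q] / max(inlink_count[q], 1)
                         for q in outlinks.get(p, ()))
            nxt[p] = (1 - damping) * seed + damping * spread
        scores = nxt
    return scores
```

With a threshold on top of this (say, flag any page scoring above some cutoff), you get exactly the "associated with X or more bad sites" behavior the papers describe: a page linking into a spam network scores high, while a page with no path into it stays at zero.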

Many, in the forums, feel that search engines would rarely penalize for being linked to by "bad" sites. But these papers clearly discuss how a page that is being linked to by "bad" sites but NOT linking back to any "bad" sites will be penalized.

I wanted to get a forum discussion going on a concept sometimes referred to as "BadRank".

...it is appropriate to punish links among link farms, but not good to punish some web pages.

Because if it's the other way around, search engines will make a mess of their results' relevancy. Hard-to-identify link farms seem to be a non-reversible link building problem, because search engines continue to value links as one of the most important elements in their ranking algorithms.

Suppose Google (as usual) becomes the first to fight these spam links by neutralizing their value, and Yahoo, MSN and others lag in implementing such actions. Will website owners stop only for Google? I strongly believe it's NOT going to be the case, because as I once heard a spammer say, "There is still so much business to be made from Yahoo! and MSN that I don't care much about Google." <-- true story!

If search engines really want to fight the web's link farm spam, in my opinion they will need to do so with an "all for one, and one for all" approach, as they are currently doing with the Indexing Summit. Otherwise, it will continue to be a problem for search engines and for the relevancy of search results for users.

But the problem with that, Rusty, is that you are in effect penalizing possibly some of the best sites on the web because they are successful. And it's not a few; it's the cream of the crop right across the board.

Doesn't seem very logical to me. Why not punish the scraper sites, which are nothing but copies of SERPs listing the sites that are successful?

>>>Many, in the forums, feel that search engines would rarely penalize for being linked to by "bad" sites. But these papers clearly discuss how a page that is being linked to by "bad" sites but NOT linking back to any "bad" sites will be penalized.

If you do a keyword search on G for "leave," one of the websites you see listed is Disney. This occurs because so many p0rn sites link to it with "leave" as the anchor text. So they are being linked to by a large number of "bad" sites, and AFAIK they aren't linking back ;-)

I can't guess at how many of these sorts of fake directories and/or scraped SERPs I've come across that list my main site; I'd never heard of them and would have no reason to link to them.

If this type of thing becomes a problem for me, since I have no control over it (and most times, no knowledge of it either), then there is indeed evil.

As an aside: a few weeks ago, I came across a site that was framing mine (and zillions of other web design sites) in order to obtain requests for services from its own visitors. I'd not heard of this site, nor had I given any authorization whatsoever either to obtain service requests on my behalf, or for the site to use my content to obtain service requests for themselves. Dirty deeds done dirt cheap.

>>>If you do a keyword search on G for "leave," one of the websites you see listed is Disney. This occurs because so many p0rn sites link to it with "leave" as the anchor text. So they are being linked to by a large number of "bad" sites, and AFAIK they aren't linking back ;-)

Bingo.

Bad neighborhoods should carry less or no weight, not penalize the receiver of the link. Anything fundamentally straying from that would create a free-for-all problem worse than exists now.

>>>Bad neighborhoods should carry less or no weight, not penalize the receiver of the link. Anything fundamentally straying from that would create a free-for-all problem worse than exists now.

I agree. If someone sets up a one-way link to you, what kind of control do you have over that link? None. It might also create chaos if your competitor linked to your site from a bad neighborhood. It is not a logical way to penalize links.

>>>Most of the papers discuss a certain threshold (i.e. the page needs to be associated with X or more "bad" sites) to be downgraded and marked as a bad site.

I don't believe you need to be linked to by other sites, or link back to other sites, to be considered a part of a bad neighborhood. You just need to have enough sites sharing similar links to be connected. It's like all your friends are friends with clowns. When you map the relationships of clowns, you will be a part of that map.

Or to take another example: suppose an SEO resources website publishes a list of directories to submit your website to. Now you have a set of thousands of SEOs, some of them with crap websites (OK, a LOT of them with crap websites), submitting their websites to the SAME SET of fifty directories.

Eventually you will produce a pattern and be part of a set because you share an affinity with the other crap websites, which is the fifty directories you dutifully submitted your website to. It's a fingerprint.
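To make the "fingerprint" idea concrete, here's a rough sketch of one way an engine could measure how strongly two sites share an outbound footprint; everything here (the Jaccard measure, the site names, the directory hosts) is my own illustration, not anything from the papers:

```python
# Illustrative only: how similar are two sites' outbound link targets?
# Sites that dutifully submitted to the same fifty directories end up
# with nearly identical target sets, i.e. a high affinity score.

def link_affinity(targets_a, targets_b):
    """Jaccard similarity of two sites' outbound link targets (0..1)."""
    a, b = set(targets_a), set(targets_b)
    if not (a | b):
        return 0.0
    return len(a & b) / len(a | b)

# Two crap sites that worked the same directory list, and one unrelated site.
site1 = {"dir01.example", "dir02.example", "dir03.example", "dir04.example"}
site2 = {"dir01.example", "dir02.example", "dir03.example", "unique.example"}
site3 = {"news.example", "blog.example"}
```

Run over thousands of sites, scores like these cluster the whole "submitted to the same fifty directories" set together, even if no two sites in the set ever link to each other.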

Hmm, but that means that someone could do that to you and screw your website? So what other affinities are there? Well, there are the affinities shared with the websites you link to.

For instance, suppose you read on an SEO resources website that the best way to find links is to search for "submit url + reciprocal" or something like that. Well, who else is doing that? Other SEOs with crap websites, or people looking to game the search engines.

Now let's take it a step further. Suppose the search engines decide that people who are likely to screw with the SERPs are people who engage in a lot of reciprocal link exchanges? Why not? Aren't most people who do this doing it for gain, with perhaps a small set of webmasters wearing pointy little white hats doing it for "the traffic?" (Kind of like those weirdos who only have sex for love, not pleasure.)

So, although you may only exchange links with quality websites (such as your own, hehe), you are creating a pattern that can be recognized. Hmm... wouldn't it be mind blowing if they started looking at patterns, at the way you do things, like they could tell by the way you use your walk you're a woman's man, no time to talk?

>>>I don't believe you need to be linked to by other sites, or link back to other sites, to be considered a part of a bad neighborhood. You just need to have enough sites sharing similar links to be connected.

Google already has "Similar pages" identified in its serps. I've thought for a while now that these could be easily tracked and cross-referenced...

The Google "Similar pages" results, which I believe classify pages by common inbound links, could be further cross-referenced and/or correlated with hosting criteria, cross-linking, etc. Perhaps they could also be weighted relative to the number of independent links ("clean" inbounds not traceable to a common source). You'd then have a measure of affiliation that certainly applies to some sites I've seen drop like rocks.
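A rough sketch of that weighting idea: score two pages' affiliation by their shared inbound links, but collapse inbounds traceable to one common source (same host or owner) into a single vote. The data model, the `source_of` mapping, and the scoring formula are all hypothetical; nobody outside the engines knows how "Similar pages" actually works:

```python
# Illustrative co-citation scoring: fifty inbound links from one network
# should count as one "clean" vote, not fifty.

def affiliation(inbounds_a, inbounds_b, source_of):
    """inbounds_*: sets of pages linking in; source_of: page -> owner/host."""
    shared = inbounds_a & inbounds_b
    if not shared:
        return 0.0
    # Collapse shared inbounds to their independent sources.
    independent_sources = {source_of[p] for p in shared}
    return len(independent_sources) / len(inbounds_a | inbounds_b)
```

Under this sketch, two pages co-cited by many genuinely independent sources score higher than two pages co-cited fifty times by one cross-linked network, which is the distinction the post above is after.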

The question of commonly used directories has been gnawing at me for a while, and I'm glad to see that it's beginning to be discussed.

Quote:

Originally Posted by martinibuster

...wouldn't it be mind blowing if they started looking at patterns...

Wonder if that might make all common patterns of a particular SEO to some extent suspect.... I mean, even if you used the same good directories rather than the same crap directories to get your sites started, if there were enough of them to form a pattern and they were always the same, after a while these fingerprints might be all over your sites as a form of manipulation. Maybe you shouldn't even start your sites off with small directories, good or bad, if you're going to use the same ones repeatedly.

I have seen enough evidence since late 2002 to show that link building "like an SEO" does not provide the best ranking results. I understand that clients want to see some type of immediate results, but these are not necessarily the best tactics for a client's long-term goals. It also depends on how competitive the targeted terms are, but I have had great (consistently 1-2 positions) ranking results with fewer than 10 IBLs from very relevant sites, on somewhat but not highly competitive terms with 20-30 million results in Google. "Links, links, links" is a 20th century ranking tactic and not very relevant in this century.

So you need to build links, but be careful who you build them with, and watch what the anchor text says, and make sure everyone's on many different IP Class-C's, and monitor every other site on the web to make sure they're not coincidentally doing something the same way you do it? And if the way you're doing things turns out to be the way every other site does them, you should invent a whole new way that the SE's haven't thought of, and patent it?

Mixing it all up seems to be the non-SEO approach. Look at your clients' competitors: some stuff every keyword they can fit into the anchor text. It seems there may even be an applied-semantics pickup when G reviews the anchor text, but IMHO the more focused anchor text still provides the better results.

The combination of the two would make the mix appear less focused on a specific keyword.