Abuse: Keyword Spamming versus Tag Spamming

I met a person recently who stopped learning web page design in 1999 and still considers himself an expert. You know the type. Dangerous with just a little bit of experience. He knows nothing of web standards, tableless web page design, or any modern techniques or technologies. He knows 1999 HTML and CSS 101 and that’s it. That doesn’t stop him from spouting expertise.

I listened to him rattle on about how the best way to get a high page rank in search engines was to hide keywords in your page code, and I tried not to roll my eyes at his antiquated techniques. Still, he got me thinking about the relationship between the old “keyword spamming” search engine technique and the use of tags.

“Keyword spamming” was a technique that involved hiding keywords – related or unrelated, it didn’t matter, you just had to get the attention of the search engines – in your code instead of in your content. Oh, you could have content, but the idea was that search engines would grab all these extra keywords and boost your search engine page rank. Your site would move to the top of search results because it used a wide variety of the most popular keywords. The additional keywords were hidden with CSS or comments so they wouldn’t be visible on the page but would be visible to the search engines. An example might be the following, buried in the code of an article on buying ring tones for your cell phone:

I thought this was odd at the time. It takes time to come up with a list of keywords to attract a search engine. You have to research which words are the most popular, and they change over time, so if you want to keep turning up in popular search results, you have to update the keywords all the time. You have to make sure that you don’t repeat any of these hidden keywords so often that you trigger a red flag, but that doesn’t stop you from using many variations on the same word or phrase since they are, technically, different. You need to think about the keywords to use, come up with a strategy to hide the keywords from visitors while leaving them visible to search engines, and get them on every page to maximize exposure and increase your chances of at least one page coming up in the search results.

This is a lot of work. So why not put that energy into writing articles with those keywords instead? Ah, but that might take creative effort, something much more challenging, right? ;-)

Now, move ahead to “modern” web design and development techniques, where such keyword spamming is recognized and punished by search engines, which know a trick when they see one. The same pages are now filled with tags. The same type of article on buying ring tones for your cell phone might include a list of tags like this:

What makes this list different from the first one? Sure, the tags are links, and the links could lead to more pages on this site or to Technorati, or even another ring tone cell phone service site. They are still keywords added to the content to “help” your SEO (Search Engine Optimization) and search engine page rank.

Does just the addition of rel="tag" change the meaning of this potentially blatant attempt at increasing keyword coverage for search engines? Do search engines spot the relationship attribute and dismiss the tagged links as search engine spam? Or not? And if not, can your site be punished for this in some way?
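To make the comparison concrete, here is a minimal Python sketch that pulls the keywords out of both kinds of markup. The two HTML snippets are invented for illustration, as are the names `KeywordExtractor` and `extract`, and the parsing is deliberately simple: it assumes every element is explicitly closed. It shows only that the two lists look much the same to a program; it says nothing about how any real search engine treats them.

```python
from html.parser import HTMLParser

# Hypothetical markup, invented for illustration only.
HIDDEN_KEYWORDS_HTML = (
    '<p>How to buy ring tones for your cell phone.</p>'
    '<div style="display:none">ring tones, free ring tones, cell phone</div>'
)
TAGGED_HTML = (
    '<p>How to buy ring tones for your cell phone.</p>'
    '<a href="/tag/ring-tones" rel="tag">ring tones</a> '
    '<a href="/tag/cell-phone" rel="tag">cell phone</a>'
)


class KeywordExtractor(HTMLParser):
    """Collects CSS-hidden keywords and rel="tag" link text."""

    def __init__(self):
        super().__init__()
        self.hidden = []   # keywords found inside display:none elements
        self.tags = []     # text of links marked rel="tag"
        self._stack = []   # one label per open element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        style = (attrs.get("style") or "").replace(" ", "").lower()
        rel = (attrs.get("rel") or "").lower().split()
        if "display:none" in style:
            self._stack.append("hidden")
        elif tag == "a" and "tag" in rel:
            self._stack.append("tag")
        else:
            self._stack.append(None)

    def handle_endtag(self, tag):
        # Simplification: assumes well-nested markup with no void elements.
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text or not self._stack:
            return
        label = self._stack[-1]
        if label == "hidden":
            # Hidden keyword lists are assumed to be comma-separated.
            self.hidden.extend(k.strip() for k in text.split(",") if k.strip())
        elif label == "tag":
            self.tags.append(text)


def extract(html):
    """Return (hidden_keywords, tag_links) found in the given HTML."""
    parser = KeywordExtractor()
    parser.feed(html)
    parser.close()
    return parser.hidden, parser.tags
```

Calling `extract(HIDDEN_KEYWORDS_HTML)` and `extract(TAGGED_HTML)` gives back two lists of phrases that overlap heavily. The only real difference is that one list was hidden from visitors and the other is made of visible links.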

As with anything else, there is always room for abuse. Part of the success of tagging, and of its sister “social bookmarking”, is the reliance upon users to maintain some form of standardization.

In other words, trust and self policing. The tag services and social bookmarking services are relying upon you, the user, to add tags that “make sense”, relate to the content, help the visitor, and don’t abuse the system. They also work hard to promote the myth that in order to be effective, the tags must link back to their services. Eventually, like everything else, there may be rules, but right now, it’s a free-for-all when it comes to tags.

We know it’s being abused. Everything gets abused. What worries me is what kind of backlash that abuse will provoke.

When email spammers started flooding our inboxes, filters and services were developed to sort through the content, evaluate it for spamminess, and then block the email or release it into our inboxes. This meant legitimate email also got caught. Many people had to relearn how to write emails in order to get through email spam protection and still communicate.

Blogging platforms, Content Management Systems (CMS), and other website management tools developed comment spam fighting resources to block comment spam. Now, if you include more than two or three links, or any words that match the comment spam filter’s rules, your comment gets eaten by the comment spam prevention tools. You like leaving comments on blogs? You’ve had to learn how to comment without triggering comment spam catchers. And if you are intimidated by the threat of comment spam, and you don’t have blogging software with strong comment spam catching and filtering utilities, then you are also more likely to turn off comments, taking away one of the great joys of blogging: Communication.

So what is being done to check for abusive use of tags? Is there anything? Sure, if splogs (spam blogs) and spamming websites get into tag service databases, the tag services say they are working to remove them when they find them. But what about abusive tag use?

What does abusive tag use look like? How do you tell? Is it a measurement of how many tags are listed, or of how the tags may, or may not, relate to the site or post content? Is there a way to check? Should tags be checked the way search engines check keywords, content, and links? What kind of tag usage will trigger a meltdown with search engines, penalizing your site? Is this happening? Have you seen tag spam? Would you report it? How?
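As a thought experiment, the two measurements mentioned above (how many tags there are, and whether they relate to the content) could be sketched in Python like this. Everything here is an assumption made for illustration: the function name `tag_report`, the `max_tags` threshold of 10, and the crude rule that a tag is “related” if any of its words appears in the post. I am not aware of any tag service or search engine that works this way.

```python
import re


def tag_report(content, tags, max_tags=10):
    """Rough, hypothetical check of a post's tags against its content."""
    # Every distinct lowercase word that appears in the post.
    words = set(re.findall(r"[a-z0-9']+", content.lower()))

    unrelated = []
    for tag in tags:
        tag_words = re.findall(r"[a-z0-9']+", tag.lower())
        # Crude relatedness rule: at least one word of the tag is in the post.
        if not any(word in words for word in tag_words):
            unrelated.append(tag)

    return {
        "tag_count": len(tags),
        "too_many": len(tags) > max_tags,  # max_tags is an arbitrary cutoff
        "unrelated": unrelated,
    }


# Invented example post and tags.
report = tag_report(
    "A guide to buying ring tones for your cell phone.",
    ["ring tones", "cell phone", "casino"],
)
```

Here `report["unrelated"]` contains only the invented “casino” tag. A rule this simple would also flag honest tags that happen to use different words than the post does, which is exactly why the question of who decides what counts as abuse is a hard one.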

I’m not saying tags should be controlled. I’m not saying they shouldn’t. I’m asking you what you think about abusive use of tags, and what you think should, or shouldn’t, be done about it. And is there anything that can be done?

10 Comments

I think the abuse of tags should be, I dunno, fought or denied in some way or something like that? But as you said, it’s a free-for-all on the web.

That’s something that may be abused by many, but lack of standards results often in good things, although tagspamming is a different league maybe…

In the end, people down with spamming of this kind will certainly not get serious faithful readers because the weblog is a mess probably and intended just for grabbing dollars instead of making the web a better place… so I think the offender just screws himself over by doing so.

Two things I have written and hope you would consider adding are the ‘Tags in the Head‘ plugin and a short companion piece on how to choose keywords / tags for optimal SEO. Both are highly relevant to your thesis: keyword spamming is bad! And, I agree with mpty’s comments above, if people can’t take the time to learn (and re-learn) – their site will be a mess and ultimately they’ll lose out.

Putting tags in the head of your page is another method, using them like keywords. It doesn’t help with using tags as site search and navigational aids.

Do not confuse splogs and abuse with looks or content. There are a lot of beautiful and clean looking sites which are very abusive. That’s how they succeed, because people judge them on the surface, not for their abusive tendencies. Yes, in the end, the abusers will lose out, but don’t confuse looks with good or bad. Evil is evil and it’s to the bone, not the surface. ;-)

Agreed. You can make a site look beautiful even though it contains garbage. I just reported a beautiful splog today, and it made my day better. I’m keeping track to see if Google does anything. I hope so!

I think the definition of spam itself is the issue here. I’ve always been annoyed by people who said not to “stuff” or “overdo” on the keywords, because the fact is, you’re trying to think from every perspective to reach all possible audiences. This doesn’t mean you’re trying to spam; it just means you’re trying to reach further out. That said, if your article is about boating, there shouldn’t be a “casino” tag anywhere near it. That is spam or unrelated material and is just a method of gaining traffic (not even remotely related traffic, so I don’t understand the point, but still…). Splogs, obviously, fall into this because they tend to combine unrelated material and tags in the hopes of gaining fame/fortune.

I don’t like the thought of regulating tags at all, though, because the whole point of tags was for it to be a human dictated experience. So what I might consider spam within a tag might not be what the author intends as spam. For instance, you might see a picture of clouds on Flickr, and the tag “happy” might be part of it. You may not equate happiness to that picture, but does that mean that tag is violating any rule? Regulating tags might make them obsolete, because it would end up putting a select few in control of them or a machine in control of them. That defies the whole point.

There are good and bad in every bunch. As in everyday real life, why should we restrict those who have done no wrong? That’s what we do when we make new standards. It’s not as if we’re making them for the violators; people will abuse, no matter the system. So why look to change or amend something that we’re enjoying, simply because there are some spoilsports around? Block ’em/don’t visit ’em, and move on, I say.

I’m not sure that I see the benefit in tags personally. I don’t use them on my web pages. The closest thing I use in any of my on-line material would be the WordPress “categories” for blogs. I do use the tags in my del.icio.us for the purpose of sorting pages and being able to find them again later. I seldom use the tags on anyone else’s pages.

It troubles me a bit to realize that some folks are going to abuse use of them for search engine optimization.

I generally don’t have a problem with people trying to improve the ranking of their pages, so long as their pages have some content of value. What I don’t like is any kind of technique that inflates the ranking of pages that have little or no value at all except to their owner. I get very frustrated sometimes with the number of such pages that sometimes float to the top of some of my searches.

thx for pointing out this excellent article when I asked you about an opinion about tag spamming.

still there are 2 points I would like to point out:

a) there is legitimate use of tags which leads to tag spamming: you could have a sidebar containing the top 25 tags of your blog, which is a very legitimate use, but it still spams every single one of your posts and pages with this plethora of tags, which are not relevant to all your posts.

b) what about these popular shoutboxes a lot of people have in their sidebar? After analyzing my logs I found that people came to my site via Google searching for strange keywords. Only after noticing that some people chatting on my site had mentioned those keywords did I understand why my site came up for them.

In my opinion it would be great to have some way of telling search engines to ignore content between 2 tags, i.e. I would put my top25 tags and my shoutbox into [ignore] [/ignore] and all would be ok.

As stated in the article, yes, there is a potential for abuse with tags. As I have also said repeatedly, your blog will be more successful if it keeps to a specific topic instead of spread out all over the place. If SEO is really critical to you, then focus, focus, focus.

The question that really needs to be answered is also two-fold:

1) Are you really worried about your tags becoming abusive? Or other people’s use of tags?

[…] For the web world, 2005 was declared to be the year of the tag. Bloggers and webmasters were trying to understand what it meant for categorizing their content, but most of all, just like they did with links and SEO, they wanted to figure out how to manipulate tags for better page ranking. […]