December 30, 2006

Blog spam, like other forms of spam, is an increasing problem. Until a few weeks ago, I didn’t have a lot of blog spam here: maybe 2 or 3 per week. Now it’s 5 or so per day, and it seems to be growing. I hate to think what popular blogs are getting hit with.

The spam never gets published, because all first-time comments are moderated. But as it grows, it takes more time to read through and make sure I don’t delete a legitimate comment along with the spam.

You can deal with some of this using a .htaccess file. There is always the possibility of blocking a legitimate user with this method, so I use it sparingly.
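For reference, the sort of .htaccess blocking I mean looks like this (the IP address and user-agent name below are made up for illustration, assuming an Apache server with mod_setenvif available):

```apache
# Deny a known comment-spam source by IP (address is hypothetical)
Order Allow,Deny
Allow from all
Deny from 192.0.2.15

# Deny by user-agent string (the name "EvilSpamBot" is hypothetical)
SetEnvIfNoCase User-Agent "EvilSpamBot" bad_bot
Deny from env=bad_bot
```

The risk, as noted, is that a shared or dynamic IP can belong to a legitimate visitor tomorrow, which is why this method deserves to be used sparingly.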

I came across a plugin for WordPress today that looks interesting. An acquaintance on another forum referred me to it: Bad Behavior.

December 23, 2006

According to this article, e-mail spam was up 35% in November. I’ve seen reports on this in a couple of different forums. At the time, mine was not up, but in the last couple of weeks e-mail spam has increased dramatically for my site.

One thing I do not understand: a lot of that spam e-mail does not make sense and does not provide a link to anywhere.

Here is an example of one I received today.

Finally the real thing – no more ripoffs!

P.E.P.

are hot right now, VERY hot!
Well this is the real thing, not an imitation!

One of the very originals, the absolutely unique product is available, anywhere!

Read what people say about this product:

“I love how fast your product worked on my boyfriend,
he can’t stop talking about how excited he is with his new girth,
length, and libido!”

Linda F., Chicago
“At first I thought the free sample package I received was some kind
of joke until I actually tried using the P.E.P. Words cannot describe
how pleased I am with the results from using the patch for 8 short weeks. I’ll
be ordering on a regular basis from now on!”
Charley Mock, San Diego

Read more testimonials about this marveouls product here!

Iraq tops listreport gave no exact numbers but “There’s an assumption that people I’m saying is this budding narrative exist,” Snow said at a White House briefing. exist,” Snow said at a White House briefing.decades to come.”

Thirteen people convicted of killings,Group and consulting with Pentagon officialsU.S. troops on Wednesday handed over militia has replaced al Qaeda in Iraq as of Baghdad last year when he was a “military emir.” a year. Attacks by Iraqi insurgents and The visit came as President Bush said he intended to travel to Iraq The death sentences were carried out after

not attend.Iraq tops listcreate a fight between the president between the White House and the Joint to fail in the Middle East. Failure in indicated the weekly average had approachednot attend.That conclusion reflects some of

top U.S. commander in the Middle East, The U.S. military on Wednesday said American-ledto meet with military leaders and other officials. I expect to learn a lot. “short-term surge in troops in Iraq.

” Gates said, “we simply cannot affordFrom mid-August to mid-November, theIraq, he said, comes first. exist,” Snow said at a White House briefing.”There’s an assumption that people

Three other explosions took place commandos in Baghdad’s eastern Jadriyah Baghdad, an Interior Ministry official said.shooting and explosions from two separate casualties, according to the quarterly to meet with military leaders and other officials.short-term surge in troops in Iraq.

May-to-August period.It said civil war remains a possibility earlier Monday in a private event, did after Gates was sworn in as the Rumsfeld, who handed off his authority At the Pentagon ceremony, Bush to fail in the Middle East. Failure in
Remove your e-mail:
Iraqi civilians suffered the bulk of casualties, according to the quarterly to an Iraqi Interior Ministry official.U.S. says al Qaeda leader capturedin the key southern Shiite province of to an Iraqi Interior Ministry official. construct that Gen. Pace uses The president is holding a newsof Baghdad University’s law school

after Gates was sworn in as the announce his decisions in January.the Army’s top commander, warned that insurgents, the report said. the highest level since Iraq regained its sovereignty in June 2004. than “thickening the mix” Video)It said civil war remains a possibility

around the capital Wednesday morning, but sectarian militias jumped 22 percentof Baghdad University’s law school coalition forces captured a senior al Qaeda in Iraq leader attempting to organize what is left” of official said.conference at 10 a.m. ET Wednesday.The leader wasn’t identified, but the U.S. “Just prior to his capture, he was

how even children are unable to escape the violence Video)nation’s 22nd secretary of defense — increased 22 percent from the previous three months. the Pentagon, including proposals by announce his decisions in January. would be a “calamity” that wouldincreasing troop levels must involve more

Gunmen in a car shot dead the assistant dean As he headed for Iraq, Gates said the and others.The official said the attack, which “you know, I think an interestingshooting and explosions from two separateal Qaeda in Iraq in the Mosul region, the military said. (Full story)security control to Iraqi troops and police.

Am I missing something? How can the sender of an e-mail like this receive any value from this kind of crap with no link to anywhere?

Are spammers now simply sending out spam to prove they can? I get a lot that makes less sense than the one I included here.

I realize that spammers depend on a small percentage of recipients clicking a link in the spam e-mail, and then on a small percentage of those making a purchase.

If they don’t provide a link to anywhere, what is the return for the spammer?

December 20, 2006

If your website doesn’t comply with the basic rules of search engine friendly web design, all your attempts at achieving success in the engines will be a waste of time: no copywriting will ever help you, nor will any number of inbound links.

Unfortunately, the importance of building websites so they’re search-engine friendly is still widely underestimated; too many webmasters still think that a few meta-tags and a search engine submission campaign is all they need to be successful. A lot of sites are created using content management systems that give the impression that they’re specially programmed to make sure the site will never be visible to the engines’ bots! Forum scripts (even those released by the most well known vendors) need a lot of hacking work to comply with the simplest SEO standards; shopping carts aren’t much better.

And yet those standards and rules, once understood, are really simple.

Internal navigation links

All the internal navigation links should be coded as plain HTML links (examples: <a href="page1.htm">text</a> or <a href="http://www.sitedomainname.com/page2.php">text</a>). Links coded using JavaScript in any way, shape or form are not followed by the search engine spiders. The same is true for links embedded in Flash objects. If the site has JavaScript-based links or Flash links, it should have an alternative navigation bar in plain HTML (or a complete alternative HTML version of the whole site, for pure Flash sites). Linked images can be followed by search engine spiders, but again, the code should have that plain HTML “href” part in it.

All internal links pointing to the home page of the site should link to “/” or “http://www.sitedomainname.com/”. Any sort of “index.html”, “default.asp” or “main.jsp” in the URLs of those internal links introduces unnecessary problems, like duplicate content and split page authority. It’s been reiterated and discussed thousands of times all over the SEO forum circuit. And yet all widely known vendors of forum scripts make this same mistake over and over again, from version to version. I have to wonder why.
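To give one concrete fix, the duplicate “index” URL problem can also be cleaned up at the server level (a sketch assuming Apache with mod_rewrite; adapt the file name to your platform):

```apache
# 301-redirect any request for .../index.html to the bare directory URL,
# so each directory page exists under exactly one address
RewriteEngine On
RewriteCond %{THE_REQUEST} ^GET\ (.*/)index\.html
RewriteRule ^ %1 [R=301,L]
```

This doesn’t excuse the forum scripts, of course; the internal links themselves should still point to “/”.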

301 and 302 redirects

We all know that the 301 (“Moved Permanently”) redirect is much more search engine friendly than 302 (“Found”). Actually, using 302 improperly is the best way to ruin your success in search engines; cases when it should actually be used are exceptionally rare. In spite of this, most web servers are set up so that 302 is the default redirect; if you need a 301, you have to specially state it, and web developers often forget about it.
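A concrete illustration of the point about defaults, in Apache .htaccess syntax (the page names are invented):

```apache
# "Redirect" with no status code silently issues a 302 ("Found"):
# Redirect /old-page.htm http://www.sitedomainname.com/new-page.htm

# For a permanent move, state the 301 explicitly:
Redirect 301 /old-page.htm http://www.sitedomainname.com/new-page.htm
```

The same applies in server-side scripts: most of them emit a 302 unless you explicitly set the “301 Moved Permanently” status header.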

Besides, 301 itself can be your good friend if used properly, but produce a lot of harm if misused. Example: some CMSs are built so that the home page (“/”) gets 301-redirected to a page like “main.htm” every time you try accessing it. It won’t kill your site in the engines but it will add a lot of unnecessary complications and should be avoided. If your CMS is built this way, my advice would be to phone your vendor and yell (politely) at their customer support person until she contacts the developers and they fix the issue. If the CMS is developed by your in-house programmers, then yell at them (been there, done that, and from my personal experience I can tell you that they can fix it easily; if they tell you they can’t, they are just being lazy).

The worst example I can think of regarding the improper use of 301 is 301-redirecting your main domain to a subdomain on a web designer’s test domain that has been temporarily used to rebuild your website. If your designer has done this to you, you need to demand that the temporary subdomain is dropped ASAP and the content is transferred back to your main domain – or that you want your money back! Subdomains are never treated as serious entries by the search engines, and all the authority you have gained over the years will be quickly lost if you allow the situation to remain like this.

Believe it or not, I have really seen this done, and the people who did it couldn’t understand why I was so exasperated with them when I noticed it. They thought they had found the best solution.

Dynamic URLS, cookies and sessions

Dynamic URLs are not necessarily bad, though static URLs are, of course, better. The “no more than three parameters in the query” rule is widely known and in most cases strictly followed by web developers. But there are two other rules, no less important but often neglected: no “id” letters in parameter names, and no session ID parameters for guest visitors.

The naming convention of dynamic parameters is the easiest thing to fix; the only thing required here is to know and remember about the rule. Unfortunately, a lot of people have never heard about it; those who have usually forget about it or just don’t take it seriously. I would like to emphasise this: the dynamic parameters shouldn’t have the “id” letters in them to avoid looking to the search engines like session identifiers, which they hate.

To actually turn off the random session ID parameters for guests and search engine spiders, while still having them turned on for logged-in users, is a much more complicated programming task. But it has to be done, because the engines hate session IDs. The reason is simple: since a session ID is assigned every time a user agent comes to the website, session IDs create lots of different pseudo-URLs for the same page. The engines would see those pages as different entries with duplicate content, which would fill their indices with tons of unnecessary information and force them to sort all the duplicates out. To avoid this, they often ignore websites that show them session IDs, or don’t go further than the home page.
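As a rough illustration of the logic involved, the server-side code needs to recognise spiders before deciding to hand out a session ID. A minimal sketch (the bot signatures below are a small, incomplete sample, and real session handling is obviously specific to your platform):

```python
# Sketch: decide whether to start a session, based on the User-Agent
# header. Known spiders get no session ID appended to their URLs; a
# fuller version would also skip guests until they log in.
BOT_SIGNATURES = ("googlebot", "msnbot", "slurp", "teoma")

def should_start_session(user_agent: str) -> bool:
    """Return False for known search engine spiders."""
    ua = (user_agent or "").lower()
    return not any(signature in ua for signature in BOT_SIGNATURES)
```
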

Another way to scare a spider away is to require that the user-agent supports cookies – and refuse to serve any content unless cookies are enabled. Since the engines’ bots don’t support cookies, they will just go away.

A few other things to remember

There are a few other things you need to keep in mind when building a search engine friendly site. These are:

frames are both SE-unfriendly and out of fashion;

if you can reduce the size of your HTML code but keep the look and feel of your page, do so;

ideally, each page should have unique title, meta description and meta keywords tags;

always keep only one version of your URL, either www or non-www; the other one should be 301-redirected to it;

valid markup may not bring you higher rankings, but it will improve your credibility and at the same time ensure that the spiders won’t have problems crawling your pages;

using client-side meta-refresh redirects is a bad idea;

if a site is going to be big, consider using folders as a structure element from the start;

spiders can’t read text written in graphics.
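The www/non-www rule in the list above is another .htaccess one-liner on Apache (a sketch assuming mod_rewrite and the example domain used earlier in this article):

```apache
# Canonicalise on the www version: 301-redirect non-www requests to www
RewriteEngine On
RewriteCond %{HTTP_HOST} ^sitedomainname\.com$ [NC]
RewriteRule ^(.*)$ http://www.sitedomainname.com/$1 [R=301,L]
```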

That’s all I wanted to mention in this article. The quality website building theme is a large one, and I’m planning on returning to it over and over again in the future. But the information above should be enough for you to build a basically search engine friendly site.

December 18, 2006

As far as achieving better SEO results is concerned, the reality these days is that knowing how to detect sites that use spammy SEO techniques is becoming more important than knowing where to place your keywords. With spammy sites growing in number at a tremendous rate and the search engines severely penalising websites for linking to such neighbourhoods, we simply can’t neglect the issue any longer.

Granted, nobody can ever be 100% sure the site s/he is reviewing is absolutely spam-free. There are a lot of different ways to spam the engines, and many of them are not obvious at all. But knowing how to detect the most obvious and primitive types of search engines spam can still protect us from exceeding the critical bad neighbourhood percentage and receiving a penalty.

So, let’s go step by step.

Most obvious

First, it is a good idea to check whether the site in question is banned from Google (which, of all search engines, is the best at detecting spam). A quick glance at the Google Toolbar, which, as we all know, has the Google PR indicator, will tell us if the PR of the site is equal to zero. If not, the site is obviously not banned. If the site is PR0, we need to do more research. Entering site:www.insertthenamehere.com into Google will tell us a lot more. Let’s say the site is neatly indexed, each page has a description and none of them has the dreaded “Supplemental Result” addition under the description (meaning Google approves of the site and, so far, hasn’t detected anything dubious there). It also means the site is well structured and search engine friendly. PR0, in this case, would mean “new site”, and after the next Toolbar PR update we will see a nice 4/10 there. Maybe more.

But if Google shows: “Your search – site:www.insertthenamehere.com – did not match any documents”, then the site is, most likely, in trouble. Possibly it just has no incoming links yet, and will never be listed in the engines at all unless at least some links appear over time. You can easily check this by running the link:www.domainname.com search in the MSN search engine. Of all engines, MSN gives the most accurate information on incoming links. If MSN shows a few thousand links but Google doesn’t seem to know anything about the site, it is a clear case of a banned site. To be doubly sure, do the same site: test using the Yahoo! search engine.

There are other, less obvious, cases, which we will discuss below. For now, another very simple test you can start with is to wait until the page is completely loaded, then press Ctrl+A to highlight the whole page. If the page contains any hidden text (i.e. text written in white on a white background, or blue on a blue background) it will immediately become visible. If that’s the case, you need go no further. The site is spamming, and even if it is not banned from any engine yet, it is a bad neighbourhood.

Go advanced

Ctrl+A will only help you detect hidden text if it is written in the colour of the background. But there are other ways of making text invisible, like setting visibility to “hidden” through CSS, or placing the layer containing the text outside the page, once again using CSS. That’s why we need to look at the code. You don’t need to understand exactly how the effect of invisibility was achieved; what matters is that the text you are seeing in the code doesn’t render in the browser, even if you click on links or run your cursor over the navigation menu. Attention: if there is an event (usually programmed using JavaScript) that makes the text visible, it is not spam.
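For illustration, the CSS-hidden text described above usually looks something like this in the source (the keywords are invented):

```html
<!-- Hidden through the CSS visibility property -->
<div style="visibility: hidden;">cheap widgets best widgets buy widgets</div>

<!-- Hidden by positioning the layer far outside the visible page -->
<div style="position: absolute; left: -5000px;">cheap widgets buy widgets</div>
```

If text like this never becomes visible through any user-triggered event, you are looking at spam.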

You need to look at invisible images, as well as at invisible text. Single-pixel spacer.gif images are usually not spam, as webmasters traditionally use them to keep table cells from shrinking and for some other purposes. But if this image has an “alt” attribute stuffed with keywords or, worse still, is linked to something, it is a good reason to stay away from this website. (Note though that invisible images are sometimes linked to visit counters, including some counters provided by the engines themselves, and also are used for conversion tracking scripts, e.g. for Google AdWords PPC campaigns, so be careful with your judgements.)

Watch for text sitting between the <NOSCRIPT>, <NOFRAMES> or <NOEMBED> tags. These tags are created for “disabled” user-agents, i.e. those that don’t support JavaScript, frames or Flash (a group to which all search engine spiders belong). If the text between those tags is an exact copy of the text you can see in your browser (or, alternatively, a standard “sorry, your browser doesn’t support frames” message), everything is all right, but if they contain something that is nowhere on the page, but is noticeably keyword-stuffed, it should raise alarm. Frames that are only 1 pixel in size in most cases mean abuse of some sort.

Spam that is visible

Not all spam is hidden. For example, link farms usually don’t hide links, but that doesn’t make them any less spammy. Quite often, FFA (free for all) pages even have the infamous acronym in their URLs, as if their creators are proud of themselves (believe it or not, most spammers are really proud of their work). So an FFA page is easy to distinguish: it will be quite long, full of links and totally useless to a human visitor.

To research the backlinks of a site, use the MSN search engine and the link: command. Quite often, you will see that some sites have built their inbound links by submitting to link farms (it is very funny to view those identical sites, one after another, sometimes with a different background colour but still looking like twins and even having identical page names). Other sites have obviously bought their incoming links and, to keep the prices low, agreed to hidden links. If you are willing to dig really deep, you might want to check the whois information. If you see that all the domains linking to the original site belong to the same owner, just stay away.

Often, you will see that the same webmaster has created 40 or 50 websites, and linked each page on each site to all other sites (usually at the bottom). Sometimes, people who do so sincerely believe they are doing nothing wrong, and are just trying to improve their web visibility when in fact they are asking for a severe penalty from Google and Yahoo! You are much safer if you don’t link to such sites. Too many outbound links at the bottom of the website’s page should immediately put you on the alert. It can only mean a heavily cross-linked network or, alternatively, a site blatantly selling text links for the sake of PageRank.

Does the text look weird to you?

Another example of visible spam is keyword stuffed copy. It is very easy to spot, because repetitive keywords look weird to the eye. If you look at the copy and feel like there is something wrong with it, read it out loud but be careful you don’t sprain your tongue!

Doorways and cloaking

These are harder to spot. If the site is small enough, you can again do the site: search and go through all pages, especially those marked as “Supplemental results”. Often, Google’s smart filters mark the doorways as “Supplementals” because they can detect something wrong with them.

The doorways are usually either meaningless or nearly identical to each other (only the main keyword changes when you move to the next page). The file names often repeat the targeted keyword. Some doorways will redirect you immediately to the home page of the site; others will just contain a link you should click. The doorways themselves will be shamefully linked through an invisible image or something looking like an element of the graphic layout. You can come across such a link quite by chance.

Of course, the bigger the site, the harder it will be to check for doorways and hidden links to them.

Cloaking is harder still to detect. You can compare the cached version of the page with the one you are actually seeing. If there is a considerable difference, chances are the page is cloaked. But it can also mean that the page has recently been changed, and not yet re-indexed by engines.

For Firefox users, it can be a good idea to set the “user-agent” string to “Googlebot”. Some funny sites can be found this way.

To check the cached version of the page in Google, use the “cache:” advanced operator with the exact URL of the page. If you have the Toolbar installed, you can also right-click the page and select the “Cached Snapshot of Page” option in the popup menu.

Some sites will turn the caching off through meta-tags. Since it is usually done to hide the fact that the site is cloaked, just stay away from such sites if you wish to be double safe.

What else the engines can tell us

The same site: search in Google can give us a lot more information about the site we are checking. We just need to know how to look at it.

For example, the “Supplemental Result” mark can mean a lot. If the site is dynamic in nature, the Supplemental Results will appear now and again to mark auxiliary pages that don’t contain much useful information, or pages that can be accessed through different URLs (a situation very common with forums, for example). It can also mean that the page is physically deleted from the web server by the site owner, but not yet dropped from Google’s database. But in many cases it also means one or another type of spam is involved. When all pages are reported as Supplemental Results, there is a good reason to suspect the site is approaching a permanent ban stage.

Apart from Supplemental Results, you can often see so-called PIPs (Partially Indexed Pages) in Google’s SERPs. It means the page is listed by its URL only, without any description at all. If there are too many PIPs in the site:www.domainnamehere.com listing, it is a very bad sign. But before shouting “Spam!” you might want to check the robots.txt file. If you see that the pages in question are disallowed from indexing, that explains the PIPs. Again, dynamic sites often show a lot of PIPs just because of their dynamic nature.

To link or not to link?

The procedures described above look like a lot of work, and those of my readers who have managed to read the whole article are probably asking themselves, “Wouldn’t it be easier to just not link to any sites at all?” It certainly would. But while the sites that link to numerous bad neighbourhoods face the risk of a penalty, those that don’t link out at all can be regarded as poor quality resources, because they work against the very spirit of the Net.

The Internet is supposed to be interlinked, so that every surfer, including inexperienced newbies, can intuitively find all the information on a subject simply by following links. So, making your site a dead end is not the best solution, even though it is the easiest one.

I would still recommend site owners to link out and to do it freely and generously. Actually, when you get used to the procedure it is not too much work to check a website for the most obvious spam techniques. Soon, your intuition will reach the point where you will be able to “smell” spam and from the look and feel of the site immediately tell when one is likely to be spammy while another one is most likely clean. I can’t explain this effect but it exists.

Are you ready? Good. Now, let’s go and build the web. The White Hat Web.

December 13, 2006

Michael Martinez, who is well known in the Search Engine Optimization industry, has recently started a new blog, SEO Theory. I noticed this morning that he had written about the value of SEO tools a couple of days ago.

Search engine optimizers love their tools, and for the life of me I don’t know why. Most of the tools I have seen through the years are pretty much duds and losers. The basic SEO tool these days is some sort of backlink checker. If you’re “lucky”, you’ll also get Toolbar PR reports for each link.

Michael does find some value in a limited number of tools, which he names.

He also deals with some ways a person might get an idea about their number of backlinks in Google.

Forum owners and moderators know what I’m talking about. The number of unwanted self-promotion posts in forums seems to be going up, just as the amount of email spam is. Do people really believe that spamming forums is the only way for them to earn their living? And how long will it take before frustrated forum owners close all their forums, just because they can no longer handle all the spam?

As some of our readers know, both Connie and I are Super Moderators at the IHY SEO forums. So, Connie can confirm: spammers are keeping all the SuperMods quite busy there. And another thing: before, they would just come, make something like 3 to 15 spammy posts and depart. Now, they stay and keep posting 20, 30, you name it, posts, until banned.

I also run a small board intended to tell the English-speaking world about my home town (as most available resources about Saratov are in Russian). Small as it is, it has attracted a lot of spammers. Most of them, I suspect, use bots and come from different IP addresses every day, and some of them (and this is what really drives me mad) even attach pictures of such a kind that I have to moderate their postings with my eyes closed (not an easy task). So far, I have been able to keep it clean (3 or 4 spam posts per day is not hard to remove), but if their number grows to 100 posts per day, I’m afraid I will have to take the board down.

December 7, 2006

Search Engines are the most effective tools for bringing traffic to websites. The searcher uses them to find a source of information or a service. The owner of the site that contains information or represents a service provides everything that is necessary – and the search engine lists the site and ranks it high for searchers to click on its link in the listing. That’s how the interests meet.

Of course, the reality of the search engine world is not that simple. These complicated mechanisms have a lot of factors to estimate and a lot of variables to calculate before they “decide” which site of the vast number of apparently equally relevant resources to make number 1. And they will tell you nothing about those variables, nor will they simplify your task when you are trying to affect their decision and achieve good positions in search engines for your targeted keywords.

The art of SEO is not any simpler than any other art. It takes talent and labour. And it takes certain knowledge of some rules and secrets that may seem easy and obvious at first. But why do people find them hard to accept and follow? Why do they come to numerous SEO forums and ask the same questions again and again? Maybe they just aren’t really obvious?

Questions and answers

1. How do I start?

You start by asking yourself what your site is going to be about. You define your goals and target audience; then you just shut your eyes and concentrate on your own ideas on how your future site should look and how it should be structured. Keyword research comes later; if you manage to answer the main questions, which are “What kind of pages am I going to have on my site?” and “How will those pages be connected to each other to form a site?” it will be much easier to find the right keywords.

Of course, if you are not creating a new site but working on an existing site, you don’t have the luxury of performing this preliminary meditation. You just look at the pages you’ve got to optimise and find the best keywords for them.

2. Do I need to target all my keywords on the home page?

By no means. You’ve got your whole site to play with, and the opportunities this gives you in terms of optimising for different keywords are tremendous. All pages from your site have their chances to be ranked and found, and these chances shouldn’t be neglected.

There is one important consideration though. Your home page is always the one with the greatest authority, from the search engines’ point of view, and with the best chance of achieving decent rankings for appropriate keywords. So, it has become common practice to target the most competitive search terms on the home page.

3. How do I determine a competitive term?

That’s easy, if you are familiar with Google’s advanced search operators. The allintitle: operator combined with your targeted keywords will show you the number of pages that have your targeted keywords in their <title> tag. The allinanchor: operator brings forward the pages containing your term in the anchor text of their inner links. You can also use both of them in the same query to refine the results of your research. Combined with the grand total of all the pages found for the term in different engines, they will give you a rough idea of the level of competition in your targeted SERPs (search engine result pages). Of course, there are other factors affecting the competition, such as the niche (for instance, the SEO niche is very competitive by definition), the quality of the sites already sitting in the top 10, and such like. You will have to research into all these factors to get a better idea of what is waiting for you – and yet only when you start actually chasing your keywords will you see the whole picture.
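As a worked example, for a hypothetical target term such as “used cars chicago”, the research queries typed into Google’s search box would be:

```
allintitle: used cars chicago
allinanchor: used cars chicago
```

The counts returned, set against the total number of results for the plain search, give the rough competition picture described above.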

Don’t be intimidated though. When you start actually doing SEO for your site, your web statistics will show you a lot of niche-specific key phrases you have probably initially missed – and they will still be effective and well targeted.

4. Do I need to include my keywords in all the alt attributes of my images?

Hey, of course not! What gave you that idea? Imagine yourself as a visitor browsing the site with graphics turned off (spiders are just like those visitors), and you will see why spacer.gif images should have nothing in their alt attributes (just alt=””), and a picture of a white horse should have alt=”horse” or alt=”white horse”, but not alt=”gifts gift baskets birthday gifts wedding gifts buy gifts online”, even if you are actually targeting those keywords. You should never sacrifice usability for the sake of keywords or SE positioning. Ever.
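In markup, the distinction looks like this (file names invented):

```html
<!-- Layout spacer: empty alt, no keywords -->
<img src="spacer.gif" width="1" height="1" alt="">

<!-- Content image: a short, honest description -->
<img src="horse.jpg" alt="white horse">
```

The keyword-stuffed alternative helps nobody: visitors browsing without graphics see gibberish, and it signals spam to the engines.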

5. My successful competitor is doing this and that on his site, and it brings him good rankings. Should I do the same to achieve the same results?

No, you shouldn’t. To make your SEO the most effective it can be, you should be original and unique, and never copy anyone at all. The search engines see their goal in delivering their users the best quality results possible; and being unique means quality (provided you offer something of genuine value). Those who imitate and copy get filtered and devalued by the engines, so don’t yield to the temptation to reproduce the next guy’s work on your site. Just be yourself.

Of course, basic SEO rules are to be understood and followed. Keywords in the copy; keywords in the title; keywords in the link text, and make sure your site is easy to navigate and use. But at the same time, you should always remember that SEO is an art, and every art requires all the creativity and imagination you possess, in order to be successful.

The main ingredient

If you really aim at being successful online, you should think about long-term success and never expect quick results. If your site is new, be prepared to face the Google sandbox. It may last for less than a year, but if you are ready to wait for a year or more, you won’t be too disappointed if it actually happens to you. Patience is going to be the main ingredient of your overall SEO success, as it always has been, and always will be.

Year after year the Net grows, the algorithms become more sophisticated and the engines will slow down their reaction time. Year after year, SEO work is going to require more patience and more dedication if we really want it to be successful. It’s the most important secret of effective SEO there is, and though it sounds simple, we all know how hard it is at times to stay patient with the engines.