These articles are not intended to be a comprehensive look at custom control development (there are 700+ page books that barely cover it), but they do cover a significant number of fundamentals, some of which are poorly documented elsewhere.

The intent is to do so in the context of a single, fully reusable and customizable control (as opposed to many contrived examples), with some awareness that few people will want many parts of the overall article but many people will want a few parts of it.

This article looks at techniques for customizing the NoSpamEmailHyperlink to make it unique for any given page without downloading and editing the code directly, which would make it difficult to incorporate any future improvements to the base control.

It assumes at least a basic knowledge of C# and class inheritance.

Customizing the NoSpamEmailHyperlink

With a few simple overrides, it is possible to completely change the nature of the NoSpamEmailHyperlink so that each site implementing the control uses it in a slightly different way, making it nearly impossible for email harvesters to detect and decode the email addresses. It is even possible to use numerous derived controls on the same page (as seen in the screenshot above), but this is not recommended.

The aim of this control is to force harvesting software to handle JavaScript as effectively as any browser and thus push up the price of such software, in turn pulling down the profit margins of the email spammers. If all they have to do is detect the link array or the code key, and then either use it or move on, this would not cause them too many problems.

If, on the other hand, the control acts slightly differently on some sites and very differently on others, the email harvesters are going to have to work a lot harder.

The NoSpamEmailHyperlink offers six properties and three methods which can be overridden to create a completely different control with the same principles.

Custom Coding Key

Replacing the encoding string is easy as long as you follow a couple of simple rules. Create a new control derived from NoSpamEmailHyperlink, and override the protected .CodeKey property.

It is essential that you never include the same character twice. This will confuse the decoding algorithm. If it finds the duplicated character, it cannot possibly know which character it was translated from.

It is also essential that you only include alphanumeric characters, unless you are also overriding the Encode/Decode functionality to handle it. Any other characters may translate a valid email address to an invalid one (for example, if the first character becomes a hyphen).
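Both rules are easy to check mechanically before you commit to a key. The sketch below is a hypothetical helper written in JavaScript for illustration; validateCodeKey is not part of the control, and the sample keys in the usage note are made up.

```javascript
// Hypothetical helper, not part of NoSpamEmailHyperlink: checks a candidate
// key against the two rules above and returns a list of problems found.
function validateCodeKey(key) {
  const problems = [];

  // Rule 2: alphanumeric characters only (and at least one of them).
  if (!/^[A-Za-z0-9]+$/.test(key)) {
    problems.push("key must be non-empty and contain only alphanumeric characters");
  }

  // Rule 1: no character may appear more than once.
  const seen = new Set();
  const duplicates = new Set();
  for (const ch of key) {
    if (seen.has(ch)) {
      duplicates.add(ch);
    }
    seen.add(ch);
  }
  for (const ch of duplicates) {
    problems.push("duplicate character: " + ch);
  }

  return problems;
}
```

An empty result means the key passes both rules. For example, validateCodeKey("bq7Zk2Xp") reports nothing, while validateCodeKey("abca") reports the repeated "a".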

It is not essential to include every alphanumeric character in the key. Missing out one or two characters can actually make it more difficult to decode. For example, if you miss the "a" and "A" characters out of the key string, all other characters will be substituted except for the As. Once you realize that a string is encoded using substitution, the last thing you expect is for some characters not to be substituted at all. And yet, the decoding algorithm will handle the missing characters correctly.
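To see why missing characters do no harm, consider the following sketch. It is a deliberately simplified stand-in for the real algorithm (a fixed shift along the key, with no changing base number), and the function names shiftAlongKey, encodeWithKey and decodeWithKey are assumptions made for this example only.

```javascript
// Simplified stand-in for the control's substitution, for illustration only.
// A character found in the key is replaced by the character `offset` places
// further along the key (wrapping round); anything else passes through.
function shiftAlongKey(text, key, offset) {
  const n = key.length;
  let out = "";
  for (const ch of text) {
    const pos = key.indexOf(ch);
    if (pos < 0) {
      out += ch; // not in the key: left exactly as it was
    } else {
      out += key[(((pos + offset) % n) + n) % n];
    }
  }
  return out;
}

function encodeWithKey(text, key, seed) {
  return shiftAlongKey(text, key, seed);
}

function decodeWithKey(text, key, seed) {
  return shiftAlongKey(text, key, -seed);
}
```

With a key of "bcdefghijklmnopqrstuvwxyz" (no "a"), every "a" in the address survives encoding untouched, and because the decoder skips exactly the same characters, decodeWithKey(encodeWithKey(address, key, seed), key, seed) still returns the original address.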

Custom Variable Names

Should the NoSpamEmailHyperlink become excessively popular, there are a number of ways in which a harvester could identify the encoded hyperlinks and discount them, or even decode them without using JavaScript.

Because the NoSpamEmailHyperlink uses GetType().Name to build the array names, function names and global-level variable names, any control derived from it will automatically use different names to avoid clashes.

However, a harvester could easily look for arrays with names ending _LinkArray and discount any links with IDs found in those arrays. Without too much more effort, it could find the _SeedArray and the ky variable and attempt to decode them.

But if we change the names of those variables on just a few pages, the process of detecting them becomes a lot more difficult.

As you can see, it is not entirely necessary for these strings to be related in any way to their function. For example, the above code changes the name of the seed array definition so that it resembles the following:

var NoSpamEmailHyperlinkExTexasHoldem = new Array("23");

You may know what this means, and the calling script will adjust itself to find the new array name, but the harvester will no longer find the array simply by hunting for _SeedArray.
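The point is easier to appreciate from the harvester's side. The sketch below imitates the kind of naive scan described earlier; findSeedArrayNames is a hypothetical function, and the default array name used in the usage note (NoSpamEmailHyperlink_SeedArray) is an assumption based on the GetType().Name plus suffix convention.

```javascript
// Hypothetical harvester-style scan: it only recognises array definitions
// whose names end in the default "_SeedArray" suffix.
function findSeedArrayNames(pageSource) {
  const pattern = /var\s+(\w+_SeedArray)\s*=\s*new Array\(/g;
  const names = [];
  for (const match of pageSource.matchAll(pattern)) {
    names.push(match[1]);
  }
  return names;
}
```

Run against a page containing var NoSpamEmailHyperlink_SeedArray = new Array("23"); the scan finds the array. Run against the renamed definition shown above, it finds nothing, so the harvester needs a smarter (and more expensive) approach.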

Note that all of the above properties, except for CodeKeyName, are used in the JavaScript at a global level. It is always advisable to use GetType().Name somewhere in the definition, so that names still do not clash if a further control derives from yours and fails to override these properties.

Custom Coding Algorithm

For the more adventurous tinkerer, it is also possible to override the .Encode() and .GetFuncScript() methods to provide a completely new algorithm for encoding and decoding the email address.

The new algorithm may be as simple or as complex as you like. Just because your favorite algorithm is simple, do not assume that this is a bad thing. As long as it is different, it is more confusing for the harvesters.

Maybe you want to make a simple change, such as accelerating the rate of change in the base number (initially the seed). Simply copy the code as described in NoSpamEmailHyperlink: 3. Email Encoding and Decoding into your derived control and amend it however you please.

Only the highlighted lines have been changed, but this is a massive change to the coding algorithm and an extra JavaScript command for the harvesters to understand.
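As a rough illustration of what accelerating the base number can look like, here is a sketch in JavaScript. It is not the control's real Encode() method or decoding script: the baseline it varies (a base number that starts at the seed and moves on after each substituted character) is an assumption, and encodeAccelerating and decodeAccelerating are hypothetical names.

```javascript
// Illustrative only. Assumed baseline: the base number starts at the seed
// and grows by 1 per substituted character. Variation shown here: the step
// itself grows, so the base number accelerates (seed, +1, +2, +3, ...).
function encodeAccelerating(text, key, seed) {
  const n = key.length;
  let base = seed;
  let step = 1;
  let out = "";
  for (const ch of text) {
    const pos = key.indexOf(ch);
    if (pos < 0) {
      out += ch; // not in the key: unchanged, base number not advanced
      continue;
    }
    out += key[(pos + base) % n];
    base += step;
    step += 1; // the extra line that makes the change accelerate
  }
  return out;
}

// The decoder must advance the base number in exactly the same way.
function decodeAccelerating(text, key, seed) {
  const n = key.length;
  let base = seed;
  let step = 1;
  let out = "";
  for (const ch of text) {
    const pos = key.indexOf(ch);
    if (pos < 0) {
      out += ch;
      continue;
    }
    out += key[(((pos - base) % n) + n) % n];
    base += step;
    step += 1;
  }
  return out;
}
```

Whatever change you make on the encoding side, the one thing that matters is that the decoding script mirrors it step for step.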

Conclusion

The variations on this theme are limited only by your imagination. You could use multiple keys, perhaps one upper-case and one lower-case key. Perhaps you want to substitute underscores and hyphens, prefixing with a random letter to keep the address valid.

You could simulate the World War II "one time pad" system, by "adding" the first letter of the email address to the first letter of the key, the second letter of the email address to the second letter of the key, and so on.
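A minimal sketch of that "adding" step might look like the following. The alphabet, the function names and the decision to cycle the pad when it is shorter than the address are all assumptions for illustration; a genuine one-time pad never reuses key material.

```javascript
// Illustrative alphabet: lower-case letters and digits only. Anything else
// (upper case, "@", ".", and so on) passes through unchanged in this sketch.
const PAD_ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789";

// direction is +1 to "add" the pad and -1 to "subtract" it again.
function padCombine(text, pad, direction) {
  const n = PAD_ALPHABET.length;
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const t = PAD_ALPHABET.indexOf(text[i]);
    const k = PAD_ALPHABET.indexOf(pad[i % pad.length]); // pad is cycled
    if (t < 0 || k < 0) {
      out += text[i];
      continue;
    }
    out += PAD_ALPHABET[(t + direction * k + n) % n];
  }
  return out;
}

function padAdd(text, pad) {
  return padCombine(text, pad, 1);
}

function padSubtract(text, pad) {
  return padCombine(text, pad, -1);
}
```

Here "adding" means adding the positions of the two characters in the alphabet and wrapping round, so "a" (0) plus "b" (1) gives "b", and "b" (1) plus "c" (2) gives "d". padSubtract with the same pad reverses the process.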

You do not have to limit yourself to substitution algorithms. You could reverse the characters in both the user and domain segments of the email address (e.g. pdriley@santt.com becomes yelirdp@ttnas.com) or use a more complex transposition algorithm.
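The reversal example can be sketched as below. The function names are hypothetical, and the choice to reverse each domain label separately while leaving the last one (the "com") alone is an assumption drawn from the example address.

```javascript
function reverseString(s) {
  return Array.from(s).reverse().join("");
}

// Reverses the user segment and each domain label except the last one.
function reverseSegments(email) {
  const at = email.lastIndexOf("@");
  if (at < 0) {
    return email; // not an email address: leave it alone
  }
  const user = email.slice(0, at);
  const labels = email.slice(at + 1).split(".");
  const last = labels.length > 1 ? labels.pop() : null;
  const reversed = labels.map(reverseString);
  if (last !== null) {
    reversed.push(last);
  }
  return reverseString(user) + "@" + reversed.join(".");
}
```

Because reversing twice restores the original, the same function serves as both encoder and decoder: reverseSegments("pdriley@santt.com") gives "yelirdp@ttnas.com", and applying it again gives back the original address.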

It really makes no difference what approach you take. The more people who add their own personal touch to the NoSpamEmailHyperlink, the more painful it becomes for the email harvesters.

Let your imagination run wild.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

About the Author

Paul lives in a backwater village in the middle of England. Since writing his first Hello World on an Oric 1 in 1980, Paul has become a programming addict, got married and lost most of his hair (these events may or may not be related in any number of ways).

Since writing the above, Paul got divorced and moved to London. His hair never grew back.

Paul's ambition in life is to be the scary old guy whose house kids dare not approach except at Halloween.

Comments and Discussions

My site uses datagrids for member/team contact information including email links, so this control is very helpful in fighting spam and preventing content plagiarism. The datagrid IE context menu "save to excel" makes it too easy for people to grab my content.

My datagrid implementation did create a limitation. My table row email field contains multiple email addresses for each member using ";" as a separator.

The NoSpamEmailHyperlink custom control DLL is only encoding the first email address in the string. Do you have any tips on how I could get it to process all the email addresses in the delimited string?

I do keep thinking about doing this as an extension, but I have about 5 million other things higher up my to-do list.

I don't think it would be too difficult: rather than defining the Email property as a single string, you should be able to use a Collection and custom editor (maybe StringCollection already has one, I don't know).

Replace the rendering process with something that takes each string, encodes it and builds a string[] array, then use String.Join to link the semi-colons in.

Then replace the JavaScript with a piece that splits the mailto: string on semi-colons and decodes each part of it.
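The client-side half of that might be sketched roughly as follows. This is untested against the real control: decodeMailtoList is a hypothetical name, and decodeOne stands in for whatever single-address decoding function the control's script already provides.

```javascript
// Splits an encoded "mailto:a;b;c" string on semi-colons, decodes each
// address with the supplied function, and joins the results back together.
// decodeOne is a placeholder for the existing single-address decoder.
function decodeMailtoList(href, decodeOne) {
  const prefix = "mailto:";
  const body = href.startsWith(prefix) ? href.slice(prefix.length) : href;
  const decoded = body
    .split(";")
    .map((part) => part.trim())
    .filter((part) => part.length > 0) // ignore a trailing semi-colon
    .map((part) => decodeOne(part));
  return prefix + decoded.join(";");
}
```

The server side would do the mirror image: split the Email value on semi-colons, encode each address separately, and use String.Join to put the semi-colons back.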

That's roughly how I was going to go about it. I hope it helps you; let me know how you get on.