They may be converted as ad\..*\.adserver\.com, or perhaps even as ad\..{2}\.adserver\.com. Of course, something like ad-(us|uk|fr|de|ru|ca|se|be)\.adserver\.com works, but I'd prefer a generic rule, since it has the additional benefit of detecting servers that may be added later.

The problem is in defining the limits of what should and should not match. After all, .* would meet your requirements for a general rule, since it will match any entry (and you could consider that optimised)!
– cdarke, Mar 28 '13 at 11:30

An implementation that does what you want (optimizing the result) would typically be done by building a tree. Bash (prior to the unreleased 4.3, which adds namerefs from ksh) doesn't support pointers or references, which are necessary for trees, so the facilities needed for a sane and reasonable implementation are not present. Ignoring the shortest-possible condition, you could simply convert the . instances (or, ideally, any characters not explicitly whitelisted as safe) to [.], separate the entries with |, and wrap the whole thing in ( and ), but, well, that's not so interesting.
– Charles Duffy, Mar 28 '13 at 11:40
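
The two approaches from the comment above can be sketched roughly as follows. This is only an illustration in Python rather than bash, since a tree is awkward to build in bash. The function names and the sample hostnames are made up for the example, and it assumes the entries contain only letters, digits, hyphens and dots. The tree version shares common prefixes only, so it is shorter than the plain alternation but not necessarily the shortest possible pattern.

```python
import re

def naive_alternation(hosts):
    """Plain alternation: each literal dot becomes [.], entries are joined by |."""
    return "(" + "|".join(h.replace(".", "[.]") for h in hosts) + ")"

def build_trie(hosts):
    """Build a character trie as nested dicts; the key "" marks the end of an entry."""
    root = {}
    for host in hosts:
        node = root
        for ch in host:
            node = node.setdefault(ch, {})
        node[""] = {}
    return root

def literal(ch):
    """Emit one character so that it only matches itself."""
    if ch.isalnum() or ch == "-":
        return ch
    if ch == ".":
        return "[.]"
    return re.escape(ch)

def trie_to_regex(node):
    """Turn a trie node into a pattern that matches every suffix stored below it."""
    ends_here = "" in node
    alternatives = [
        literal(ch) + trie_to_regex(node[ch])
        for ch in sorted(k for k in node if k != "")
    ]
    if not alternatives:
        return ""
    if len(alternatives) == 1 and not ends_here:
        return alternatives[0]
    group = "(" + "|".join(alternatives) + ")"
    # If an entry ends at this node, everything after it is optional.
    return group + "?" if ends_here else group

# Hypothetical sample data, not taken from the question.
hosts = ["ad.us.adserver.com", "ad.uk.adserver.com", "ad.fr.adserver.com"]
print(naive_alternation(hosts))
print(trie_to_regex(build_trie(hosts)))
```

Both patterns match exactly the listed entries and nothing else, so neither gives the generic "catch future servers" behaviour asked for in the question; the tree only compresses the list.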

This looks more like a Computer Science project than a simple programming question!

I don't think you'll find any straightforward bash/sed/awk instructions to do this. You want to create regular expressions programmatically, whereas sed/awk are typically better suited to using regexes than to generating them. I guess you'd have to look into approximate string matching and, specifically, computing the Levenshtein distance between two strings.
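
For reference, the Levenshtein distance mentioned above can be computed with the standard dynamic-programming recurrence. The following is a minimal sketch in Python (not bash/sed/awk); the function name is arbitrary. How the distances would then be used to group similar entries and derive a pattern is left open here, as it is in the answer.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    # prev[j] holds the distance between the first i-1 chars of a and the first j chars of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]  # turning i characters into the empty prefix takes i deletions
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or keep if equal
            ))
        prev = cur
    return prev[-1]

# Two entries differing in a single character are at distance 1.
print(levenshtein("ad.us.adserver.com", "ad.uk.adserver.com"))
```

Entries that are a small distance apart (for example, differing only in a two-letter country code) would be natural candidates for merging into one generic rule.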