
On Tue, 14 Feb 2006 16:45:10 +0530,
efuzzyone@netscape.net wrote:
> I have a request that the user engine and the wiki engines should
> incorporate some sort of spam protection, ideally using images.
You should know that CAPTCHAs, as they're called, are controversial,
because (a) they're very unfriendly to the disabled, and (b) they're
actually easy to automate. Even assuming you can't OCR them yourself, which
is a bad assumption these days, you just put up a porn site, display any
CAPTCHAs your spambots see to the end users, and let THEM figure it out
for you.
Jay Levitt

Hi,
Jay Levitt wrote:
> CAPTCHAs your spambots see to the end users, and let THEM figure it out
> for you.
It's not terribly difficult to parse domains out of posted URLs and
check them against SURBL (http://www.surbl.org). I've written
proof-of-concept code to do this in Perl, and there's Net_DNSBL for PHP.
I can't imagine it's that difficult to port either of these to Ruby.
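
As a rough illustration of the "parse domains out of posted URLs" step, here is a minimal Ruby sketch. The constant HOST_IN_URL and the method extract_hosts are hypothetical names invented for this example, not part of any existing library. It only pulls out hostnames; reducing a hostname to its registered domain (which is what SURBL expects) still needs TLD rules of some kind, and URLs with userinfo or unusual schemes are not handled.

```ruby
# Hypothetical helper: collect the hostnames of http/https URLs in free text.
# The pattern stops at the first character that cannot be part of a hostname
# and never ends on a dot or hyphen, so trailing punctuation is dropped.
HOST_IN_URL = %r{https?://([a-z0-9](?:[a-z0-9.-]*[a-z0-9])?)}i

def extract_hosts(text)
  text.scan(HOST_IN_URL).flatten.map(&:downcase).uniq
end
```

Each hostname returned this way would then be trimmed to its registered domain before being looked up.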
-- Bob
PS: I used the following to extract the TLD recognition regexes from
Mail::SpamAssassin. With a PCRE engine and a little adjustment (add
'(^.*\.)?' at the front and '$' at the back), the regexes are fairly
portable.
#!/usr/bin/perl
use strict;
use Mail::SpamAssassin::Util::RegistrarBoundaries;

print "##### 4LD regex #####\n"
  . $Mail::SpamAssassin::Util::RegistrarBoundaries::FOUR_LEVEL_DOMAINS, "\n";
print "##### 3LD regex #####\n"
  . $Mail::SpamAssassin::Util::RegistrarBoundaries::THREE_LEVEL_DOMAINS, "\n";
print "##### 2LD regex #####\n"
  . $Mail::SpamAssassin::Util::RegistrarBoundaries::TWO_LEVEL_DOMAINS, "\n";
print "##### TLD regex #####\n"
  . $Mail::SpamAssassin::Util::RegistrarBoundaries::VALID_TLDS, "\n";
----
I found a bit of Ruby code at
http://www.spampalforums.org/phpBB2/viewtopic.php?t=5156 that queries
SURBL. It looks like all that's needed is decent packaging and text ->
URL -> domain extraction (see above and
http://www.surbl.org/implementation.html for more info)
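#!/usr/bin/ruby
require 'resolv'

# Look up the domain given as the first argument in the SURBL "sc" zone.
dns = Resolv::DNS.new
domain = ARGV[0]
begin
  records = dns.getresources("#{domain}.sc.surbl.org",
                             Resolv::DNS::Resource::IN::A)
  if records.empty?
    # getresources returns an empty array (it does not raise) when nothing is found
    puts "not found - address '#{domain}' not in list, thus hopefully not spam"
  else
    records.each { |r| puts r.address }
    # etc
  end
rescue Resolv::ResolvError => e
  puts "lookup failed - #{e.message}"
end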
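
For the "decent packaging" part, one possible shape is sketched below. The names SURBL_ZONE, surbl_query_name and surbl_listed? are made up for illustration, and the zone is simply the one used in the script above. The resolver is passed in as an argument so the lookup can be exercised with a stub instead of live DNS; this assumes, as with Resolv::DNS#getresources, that an unlisted name yields an empty array.

```ruby
require 'resolv'

# Zone name copied from the script above; an assumption, not a recommendation.
SURBL_ZONE = 'sc.surbl.org'

# Build the DNS name to query for a domain (lower-cased, trailing dot removed).
def surbl_query_name(domain, zone = SURBL_ZONE)
  "#{domain.downcase.chomp('.')}.#{zone}"
end

# True if the resolver returns at least one A record for the query name.
# `resolver` only has to respond to getresources(name, type).
def surbl_listed?(domain, resolver = Resolv::DNS.new)
  records = resolver.getresources(surbl_query_name(domain),
                                  Resolv::DNS::Resource::IN::A)
  !records.empty?
end
```

A caller would run each extracted domain through surbl_listed? and reject or flag the post if any of them comes back true.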