Overview

All GNHLUG mailing lists are currently hosted on the server liberty, and run via Mailman. As of this writing, all the lists are open to self-subscription, and most allow any subscriber to post. For most lists, mail from addresses not subscribed is silently discarded. The lists get too much spam to do otherwise, barring some kind of sophisticated anti-spam solution.

List administration

Admin work is generally done via the web-based admin interface for each list. See the list of list admin pages. A link to each list's admin page is also present at the bottom of each list's listinfo page. The admin passwords are not documented on this public website. If you need access, contact one of the existing list admins; if they trust you, they will presumably give you the password.

Currently, all the lists have the owner/admin set to listmaster at the GNHLUG domain. That is an alias, which currently just goes to BenScott.

Announce List

Posts to gnhlug-announce are restricted. This is done mainly to keep mis-directed messages from annoying people who just want announcements.

To accomplish this, the list is configured to set every subscriber to be moderated by default. That causes all postings to be held until a list admin approves them. When approving a post, the admin can optionally also clear the moderate bit from the poster's address, so they can post future announcements without admin intervention.

This does mean an address needs to be subscribed to gnhlug-announce to post. At the least, subscribe the address and set the "nomail" bit.
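The rules above can be modelled in a few lines. This is only an illustrative sketch, not Mailman code: the `Subscriber` class, the `disposition` function, and the result strings are hypothetical names invented here, and what the list actually does with mail from a non-subscribed address is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Subscriber:
    address: str
    moderated: bool = True   # list default: every new subscriber is moderated
    nomail: bool = False     # subscribed only in order to post; receives no list mail


def disposition(sender, subscribers):
    """Return what happens to a post from `sender` (hypothetical labels)."""
    member = subscribers.get(sender.lower())
    if member is None:
        # The address must be subscribed before it can post at all.
        return "not-subscribed"
    if member.moderated:
        # Held until a list admin approves it.
        return "hold"
    # An admin has cleared the moderate bit for this address.
    return "accept"


subscribers = {
    "reader@example.org": Subscriber("reader@example.org"),
    "announcer@example.org": Subscriber(
        "announcer@example.org", moderated=False, nomail=True
    ),
}
```

Note that the "nomail" bit only affects delivery; in this model it has no influence on whether a post is held.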

List archives

Lost mail

Between the dates of Sat 14 and Thu 19 in Oct 2006, some mail may not have made it into the archives. Details. It should be possible to track down the lost mail in third-party archives and inject it back into Pipermail now that things are working (?) again. -- BenScott - 19 Oct 2006

Message splitting during archive rebuilds

There is some minor breakage in the list archives. As part of the migration process, the archives got rebuilt. During an archive rebuild, a very small percentage of messages get parsed incorrectly. Specifically, a line beginning with "From", preceded by a blank line, gets treated as the start of a new message, regardless of any other surrounding syntax (or lack thereof). This means a handful of messages get broken into pieces. The parts after the "real" message don't have any valid headers, of course, so they get filed under the current month. It should be possible to rebuild the archives again, using a better pre-processor (like formail) to recognize messages. I'm unsure of the details.
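The failure mode can be reproduced with a small sketch. The naive test below mirrors the behaviour described above. The strict test is an assumption: it requires the traditional mbox separator shape ("From", an address, then a weekday, month, day, time, and year) and is not a description of what formail or Pipermail actually do. All function names are made up for illustration.

```python
import re

# Assumed shape of a real mbox separator line:
#   From <address> <weekday> <month> <day> <hh:mm:ss> <year>
STRICT_FROM = re.compile(
    r"^From \S+ +(Mon|Tue|Wed|Thu|Fri|Sat|Sun) +"
    r"(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) +"
    r"\d{1,2} +\d{2}:\d{2}:\d{2} +\d{4}\s*$"
)


def naive_is_separator(line):
    # The problematic rule: any line that merely begins with "From".
    return line.startswith("From")


def strict_is_separator(line):
    return STRICT_FROM.match(line) is not None


def split_messages(text, is_separator):
    """Split mbox-style text at separator lines that follow a blank line."""
    messages, current = [], []
    prev_blank = True  # the start of the file counts as "after a blank line"
    for line in text.splitlines():
        if prev_blank and is_separator(line) and current:
            messages.append("\n".join(current))
            current = []
        current.append(line)
        prev_blank = line.strip() == ""
    if current:
        messages.append("\n".join(current))
    return messages


SAMPLE = (
    "From alice@example.org Sat Oct 14 10:00:00 2006\n"
    "Subject: meeting\n"
    "\n"
    "Hello all,\n"
    "\n"
    "From what I hear, the meeting has moved.\n"
)

print(len(split_messages(SAMPLE, naive_is_separator)))   # 2: body line split off
print(len(split_messages(SAMPLE, strict_is_separator)))  # 1: message kept whole
```

With the naive rule, the second piece starts at "From what I hear" and has no headers of its own, which is exactly the kind of fragment that ends up filed under the current month.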

Better archive software

Pipermail works, but it's simplistic and limited. I've seen much better archiving systems/UIs. It would be nice to use one of them. In particular, we would like something that spam-guards all email addresses (including those in body and non-standard headers), not just the From/To/Sender headers that get spam-guarded now. This would let us make the archives fully public again.

Obsolete stuff

All of the following is obsolete, but is preserved here for reference.

Queue Management

Messages from non-subscribers are held for manual disposition. Since the list posting addresses are in circulation on the Internet, they get spam. The queue of held messages thus quickly accumulates spam. It needs to be cleaned on a regular basis -- at least weekly, sometimes daily. Failure to do so results in the queue growing to unworkable proportions depressingly quickly.

To clear the queue: Log in to the admin interface, click "Tend to pending administrative requests", and deal with the resulting pile of crap^W^W^W list of messages. Most messages will be spam. So you generally go along, set each message to "Discard", and then click "Submit".

Quick Links

For convenience, the following links to the various queues may be useful.

Tip: Put the above links in a bookmark folder in Firefox, and use the "Open in tabs" option that appears at the bottom of the resulting bookmark menu.

Discard-All Bookmarklet

Most messages end up being discarded. One quickly wishes "Discard" were the default, rather than "Defer". The following bookmarklet will accomplish that; it sets all the checkboxes to "Discard". Of course, you still have to weed through all the spam and set any non-spam to "Approve". A little caution is in order; it is easy to accidentally discard a useful post this way.

The final shape of the script bears almost no resemblance to the ideas in the discussion. The final approach is based on the principle of "Do no harm": no false positives are allowed, no submissions are approved without direct human intervention, and no notification emails are generated that might be sent to forged addresses. The tool must also not interfere with the existing system's function or put extra load on the servers.

Current approach:

When Mailman gets a non-member submission, it sends an email to gnhmod@Jeff's domain, activating a procmail recipe which kicks off the script.

A curl command is used to fetch the gnhlug mailman pending requests page, containing fragments of non-member submissions to the list.

The script loops through each submission, extracting the fragment and using bogofilter to score its "spaminess".

If the score is twenty percent higher than the cutoff for spam, the email is automatically discarded. Because the spam going to the list is very uniform, this results in more than 90% of the spam being discarded immediately, despite the 20% higher score requirement.

Low-scoring, non-discarded fragments are kept until a human can review them using the same tool in a different mode. The human can discard the spam and/or feed the spam into bogofilter so that any more of the same spam will be identified by the system.
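The discard-or-hold decision in the steps above might look roughly like the following. This is a sketch under stated assumptions, not Jeff's script: `triage`, `fake_score`, and the cutoff value are hypothetical, the scoring function is passed in rather than calling bogofilter, and "twenty percent higher" is read here as cutoff times 1.2, which may not be how the real tool computes it.

```python
def triage(fragments, score, spam_cutoff, margin=0.20):
    """Split fragments into (discarded, held).

    score       -- callable returning a spam score for one fragment
                   (stands in for bogofilter; hypothetical interface)
    spam_cutoff -- the score at which a message would be called spam
    margin      -- extra safety margin above the cutoff (assumed multiplicative)
    """
    discard_at = spam_cutoff * (1 + margin)
    discarded, held = [], []
    for fragment in fragments:
        if score(fragment) >= discard_at:
            # Well above the cutoff: safe to discard automatically.
            discarded.append(fragment)
        else:
            # Everything else waits for a human. Nothing is ever approved here.
            held.append(fragment)
    return discarded, held


# Made-up scores and cutoff, purely for illustration.
FAKE_SCORES = {"cheap pills": 0.90, "odd newsletter": 0.55, "meeting notes": 0.10}


def fake_score(fragment):
    return FAKE_SCORES[fragment]


discarded, held = triage(list(FAKE_SCORES), fake_score, spam_cutoff=0.5)
print(discarded)  # ['cheap pills']
print(held)       # ['odd newsletter', 'meeting notes']
```

Note that "odd newsletter" scores above the plain cutoff but below the raised threshold, so it is held for review rather than discarded. That is the trade the extra margin makes: fewer automatic discards in exchange for no false positives.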

The approach is described here so that it can be reviewed for potential problems. From its activation to Jan 1, 2006, this system was carefully watched. It seems to be performing well and will no longer be so carefully monitored. If you notice any real or even potential problems with this approach, please let Jeff Kinz know immediately.