Improving accuracy of the Junk button by screening/training users

Anyone who has ever received postmaster or (worse) AOL feedback loop complaints is familiar with the problem of lusers using the "Junk" button for "Delete."

If you have a moderate population of students reporting every email that certain college officials send to the "all students" mailing list as Junk, then eventually, you have a problem. I need some help on some components of a possible solution (which, alas, does not involve disciplining certain college officials as to the responsible use of bulk email).

Under the Zimbra hood, currently:

When someone hits the "Junk" button, the email is forwarded as an attachment to the spam.* account created upon server installation. A nightly cron job uses that account's INBOX as a corpus for training spamassassin's bayes database.

I encourage you to "View Mail" for your spam user. You might be surprised how much nonspam is being reported as spam.

Vision for possible improvement:

1) When someone sends an email to spam.*, we shunt the mail into a "manual review" mailbox, and autoreply with a pointer to a "So you want to help us by reporting Junk Mail?" web page.

2) We use the web page to educate the user about how to opt out of "official" mailing lists and how to squelch junk mail from commonly complained-about sites such as Facebook, Abercrombie, and Amazon. And then we welcome their feedback about actual unsolicited bulk email.

3) They agree. They are added to the spam.* user's Contacts.

4) A filter in the spam.* account allows email from its Contacts, and only email from its Contacts, to get through to INBOX.
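
For what it's worth, the filter in step 4 boils down to an allowlist check on the reporter's address. Here is a minimal sketch in Python, not Zimbra code. The function name `route_report`, the folder name "Manual Review", and passing the Contacts in as a plain set are all made up for illustration, and it assumes the reporter's address shows up in the From header of the forwarded report.

```python
from email import message_from_string
from email.utils import parseaddr


def route_report(raw_message, approved_reporters):
    """Pick the folder for a forwarded Junk report.

    approved_reporters is a set of lowercase addresses, standing in for
    the spam.* account's Contacts (hypothetical representation).
    """
    msg = message_from_string(raw_message)
    # parseaddr splits "Alice <alice@example.edu>" into (name, address)
    _, addr = parseaddr(msg.get("From", ""))
    if addr and addr.lower() in approved_reporters:
        return "INBOX"          # feeds the nightly Bayes training
    return "Manual Review"      # held back, reporter gets the autoreply
```

A report from someone who has not been through the web page stays out of the training corpus until a human looks at it.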

Questions:

1) Where exactly does the built-in vacation functionality hook in? Is it possible to write a sieve filter that vacation-replies only under some circumstances? The other obvious thing that people might want is to send vacation autoreplies only to one's contacts or local domain. Makes no sense to send a vacation autoreply to a spammer or a burglar.

2) Hmm, I guess I only have one question. General comments about this scheme?
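
To make question 1 concrete: standard Sieve has a vacation extension (RFC 5230) where `vacation` is an action that can sit inside an `if` test, but whether Zimbra's built-in vacation goes through that path is exactly what I don't know. The decision itself is simple, so here is a hedged Python sketch of it. `should_autoreply` and its parameters are hypothetical names, not anything from Zimbra.

```python
def should_autoreply(sender, contacts, local_domains):
    """Reply only to known contacts or to senders in a local domain.

    contacts and local_domains are sets of lowercase strings
    (an assumed representation for this sketch).
    """
    sender = sender.strip().lower()
    if "@" not in sender:
        return False  # malformed or empty envelope sender: stay silent
    domain = sender.rsplit("@", 1)[1]
    return sender in contacts or domain in local_domains
```

Usage would look like `should_autoreply("prof@college.example", set(), {"college.example"})`, which allows the reply because the domain is local, while an unknown outside sender gets nothing.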

So for the 'end user training about spam' one, let's ditch the first-click-shunt to a manual review mailbox for now, as I just foresee problems with that concept:
a) time needed to implement
b) time needed by admins to actually review the mail
c) privacy concerns

Instead, on the first click, load a page in zimbra that says something to the effect of:
"In order to cut down on future false-positives, before you're allowed to mark something as junk please review the following concepts about spam."

Then you go into:

How to opt out of "official" mailing lists instead of using the junk button, because it cuts down on false-positives & reduces mail volume altogether.

How to squelch junk mail from commonly complained-about sites such as Facebook, Abercrombie, and Amazon, so that you aren't affecting mail delivery that others find useful.

And point them at an external link where you welcome their feedback about actual unsolicited bulk email that they are receiving.

blah blah blah...

At the end there would be a checkbox (or whatever method you want to dream up: multiple checkboxes, a mini quiz, whatever) to be sure they read it.

They 'pass' and they're added to the list.

And so you don't have to worry about what to be doing with that first 'possible junk/spam' email, you just say:
"Thank you, you can now go about marking items as spam via the junk button."

Of course make the whole thing enabled/disabled per cos/account, with the default set to 'disabled'.
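
Roughly, the flow above could be sketched like this in Python. This is only an illustration of the logic, not how Zimbra stores COS or account settings. `JunkTrainingPolicy` and the return values "report" and "show_training" are invented names.

```python
from dataclasses import dataclass, field


@dataclass
class JunkTrainingPolicy:
    # Default is 'disabled': nothing is gated until a COS or account is listed.
    enabled_cos: set = field(default_factory=set)
    enabled_accounts: set = field(default_factory=set)
    passed: set = field(default_factory=set)  # accounts that finished the page

    def on_junk_click(self, account, cos):
        """Decide what the Junk button does for this click."""
        gated = cos in self.enabled_cos or account in self.enabled_accounts
        if not gated or account in self.passed:
            return "report"         # normal behaviour: forward to spam.*
        return "show_training"      # first click: load the education page

    def record_pass(self, account):
        """Called once the user ticks the checkbox / passes the quiz."""
        self.passed.add(account)
```

With the feature enabled only for a hypothetical "students" COS, a student's first click returns "show_training", and after `record_pass` the same click returns "report". Everyone outside that COS is unaffected.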

It actually gets a little more complicated than that. The underlying problem is that on one hand, you want to pool everybody's spam to train the filters, so that we all benefit from each user's experience, but on the other hand, one user will consider the Victoria's Secret emails (to pick a hypothetical example) to be spam while another will consider them essential news.

I would say that the only effective solution to this would be to have COS that can train the spam filters and other COS that can't.

Or maybe have a two-level Bayesian database--one global and one personal, and only have selected COS able to train the global list.

These are both easy to describe; methinks they'll be a lot more complicated to implement. . .
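
To show what I mean by two levels, here is a toy sketch in Python. It is not how spamassassin's bayes code works; it is a bare-bones token counter with add-one smoothing, and `TokenStore`, `TwoLevelFilter`, and the choice to simply add the global and personal log-odds are my own assumptions for illustration.

```python
import math
from collections import Counter


class TokenStore:
    """Counts how many spam/ham messages each token has appeared in."""

    def __init__(self):
        self.spam, self.ham = Counter(), Counter()
        self.n_spam = self.n_ham = 0

    def train(self, tokens, is_spam):
        (self.spam if is_spam else self.ham).update(set(tokens))
        if is_spam:
            self.n_spam += 1
        else:
            self.n_ham += 1

    def log_odds(self, tokens):
        """Sum of log(P(tok|spam) / P(tok|ham)), add-one smoothed."""
        total = 0.0
        for tok in set(tokens):
            p_spam = (self.spam[tok] + 1) / (self.n_spam + 2)
            p_ham = (self.ham[tok] + 1) / (self.n_ham + 2)
            total += math.log(p_spam / p_ham)
        return total


class TwoLevelFilter:
    def __init__(self, global_trainer_cos):
        self.global_store = TokenStore()
        self.personal = {}  # account -> TokenStore
        self.global_trainer_cos = set(global_trainer_cos)

    def report(self, account, cos, tokens, is_spam):
        # Everyone trains their own database...
        self.personal.setdefault(account, TokenStore()).train(tokens, is_spam)
        # ...but only selected COS may touch the global one.
        if cos in self.global_trainer_cos:
            self.global_store.train(tokens, is_spam)

    def score(self, account, tokens):
        """Positive means spam-leaning for this particular account."""
        score = self.global_store.log_odds(tokens)
        if account in self.personal:
            score += self.personal[account].log_odds(tokens)
        return score
```

A student junking the catalog emails only moves their own score; the same message scores neutral for everyone else until an account in a trusted COS reports it too. Adding the two log-odds double-counts for users who train both levels, so a real design would probably weight them.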

I think he was going for educating them to get less spam in the first place & when not to use junk/spam as delete.
What you're expanding it to has been requested before: Bug 3870 - per user Spam Assassin score

DSPAM...
This is also interesting: AboutMaia - Maia Mailguard - Trac
It offers individual & system-wide spamassassin bayes training (domain rules can override any individual's rules), is built on amavisd-new, and uses either two SMTP-based mail servers in a dual-MTA arrangement OR an SMTP server with re-injection capability (e.g. Postfix).

It uses an older/custom amavisd-new 2.2.1 even though 2.5.2 came out this past June; the reasons are here: AmavisVersion - Maia Mailguard - Trac
But from what I'm understanding you could technically add another on top of that? "supports a wide range of virus scanners, and can use multiple scanners for layered protection/one or more virus scanners supported by amavisd-new"

I'm in the same boat. When a newsletter gets "junked" and I find it during my regular evening review, I reply from the spam account (I created an alias called spam to reduce their confusion as to the source) with a definition of unsolicited and bulk and instructions on how to unsubscribe. Although manual, it is surprisingly effective. A couple of reminders and people seem to get it pretty quickly.

Hmm, after some quiet period, a DSPAM 3.8 was released in March, though the changelog shows no activity since December 2006. Not sure if anyone at Zimbra has been tracking it, or if they just gave up on it.

I like the ideas above. I can post an RFE if no one else has interest. (I recently got really busy, but that could change again.)

The moderator JoshuaPrismon (aka Lostknight) was the one who influenced dspam being included in the first place - it was some excellent work, for those who remember (all the way back to dspam 3.6.1)
Josh, have you been keeping abreast of the dspam world lately?