Where ADate is a parameter and the syntax depends on the DB platform used.

This marks the old records to be deleted.

Then we issue the actual delete query:

DELETE FROM tblQuarantine WHERE tblQuarantine.Expire <> 0

That deletes most of the rows from tblQuarantine (and, due to the database constraints, the related records in tblMsgs), but may leave behind some "orphaned" rows in tblMsgs. So we then issue the following as a backup to ensure all orphans are deleted as well:

DELETE tblMsgs FROM tblMsgs LEFT JOIN tblQuarantine

ON tblMsgs.MsgID = tblQuarantine.MsgID WHERE (tblQuarantine.MsgID IS NULL)
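
For anyone who wants to try the two delete statements on a throwaway database first, here is a minimal sketch in Python with SQLite. The table and column names (tblQuarantine, tblMsgs, MsgID, Expire) come from the queries above; everything else (QuarID, Body, the sample rows, the cleanup function) is made up for illustration. It assumes the marking step has already set Expire, and it defines no constraint between the tables. SQLite does not accept the DELETE ... FROM ... LEFT JOIN form, so the orphan cleanup is written with NOT EXISTS, which should pick the same rows.

```python
import sqlite3

def cleanup(conn):
    """Delete expired quarantine rows, then any message no quarantine row points to."""
    cur = conn.cursor()
    # Step 1: bulk delete of the rows already marked as expired.
    cur.execute("DELETE FROM tblQuarantine WHERE Expire <> 0")
    expired = cur.rowcount
    # Step 2: orphan cleanup (NOT EXISTS stands in for the LEFT JOIN ... IS NULL form).
    cur.execute(
        "DELETE FROM tblMsgs WHERE NOT EXISTS ("
        " SELECT 1 FROM tblQuarantine q WHERE q.MsgID = tblMsgs.MsgID)"
    )
    orphans = cur.rowcount
    conn.commit()
    return expired, orphans

# Hypothetical sample data: message 1 is expired for two recipients,
# message 2 is still current, message 3 has no quarantine row at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblMsgs (MsgID INTEGER PRIMARY KEY, Body TEXT);
CREATE TABLE tblQuarantine (
    QuarID INTEGER PRIMARY KEY,
    MsgID INTEGER,
    Expire INTEGER DEFAULT 0
);
INSERT INTO tblMsgs VALUES (1, 'old spam'), (2, 'recent spam'), (3, 'already orphaned');
INSERT INTO tblQuarantine (MsgID, Expire) VALUES (1, 1), (1, 1), (2, 0);
""")

result = cleanup(conn)
remaining = [r[0] for r in conn.execute("SELECT MsgID FROM tblMsgs ORDER BY MsgID")]
print(result, remaining)  # (2, 2) [2]
```

Only message 2 survives: message 1 loses both of its quarantine rows in the first step and is then removed as an orphan together with message 3.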

Today I tried the script to delete all spam mail in quarantine older than 7 days.

It does not seem to work.

The database files are 6 GB both before and after I run the script, and there is currently 18 days' worth of spam mail. As I set it to delete everything older than 7 days, I expected the database files to get much smaller, so I don't think it works.

You can see a screenshot of the script I ran and how it looked when it completed.

On a closer look, I can see that after optimizing the database, tblquarantine is about 120 MB, which is acceptable. I can't remember what it was before because my focus was on the tblmsgs table. That one is about 5 GB and does not seem to get smaller.

Do you know what the table is used for?

Is there a way to empty that one or make it smaller?

I can't even open it, as it freezes, probably because it's too big.

At the moment I have managed to get Spamfilter up and running again by disabling the database, but that's not optimal.

I seem to have this issue every time I set up this system. I have tried to set up ISP spamfilter 3 times on 2 servers, and it runs fine for 2-3 weeks; then it starts to behave strangely and customers cannot connect to send or receive mail. At the same time the server becomes very slow, to the point of freezing.

If I browse inside the spamfilter home directory using Windows Explorer and click on quarantine, Explorer freezes.

I hope someone with similar issues has a solution to the freezing, or knows how to make tblmsgs smaller.

tblmsgs stores the actual email that has been placed in quarantine. You can only make this table smaller by storing the messages for fewer days. If you currently store for 14 days and reduce it to 7 days, the table size would drop by about half.
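
As a rough back-of-the-envelope check, the size can be scaled linearly with the retention period. This is only a sketch: it assumes a steady daily volume and average message size, the function name is made up, and the 5 GB and 18 days figures are the ones quoted earlier in this thread.

```python
def estimated_size_gb(current_gb, current_days, new_days):
    """Scale the table size linearly with retention, assuming steady daily volume."""
    return current_gb * new_days / current_days

# 5 GB holding 18 days of mail, cut down to 7 days of retention.
from_18_days = round(estimated_size_gb(5.0, 18, 7), 2)
# The 14 -> 7 days case mentioned above: about half.
from_14_days = round(estimated_size_gb(5.0, 14, 7), 2)
print(from_18_days, from_14_days)  # 1.94 2.5
```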

I see from this thread that you are doing your own cleanup rather than relying on spamfilter to do it, so you might want to make sure it is working.

How many messages do you process a day that you have a 5gb tblmsgs table?

--------------------------------------------------------------
I am a user of SF, not an employee. Use any advice offered at your own risk.

I was just searching this forum to see what cleanup functions are being run as I found a bunch of old messages in the tblquarantine.

You mentioned a foreign key constraint, but neither my tblquarantine nor my tblmsgs has any. I did some searching of the database scripts and found this one:

This creates a constraint from tblQuarantine -> tblMsgs, so I assume that spamfilter inserts the message into tblMsgs first to generate the msgID needed for tblQuarantine.

A message record could not exist in tblQuarantine without a matching record in tblMsgs. But you delete messages via tblQuarantine, since tblMsgs has no date information. So that constraint doesn't really do anything; it would actually need to be the other way around, no?

The foreign key constraint is, however, not enforced on inserts, just on deletes. This means that if a record in the tblMsgs table is deleted, then all records in the tblQuarantine table that point to it are automatically deleted by the database as well. Do not forget that if a spam email is sent to multiple users, we only store one record with the actual email contents in tblMsgs, while we save multiple "headers" for the individual recipients in tblQuarantine. This saves a lot of disk space, as the actual email is stored only once.
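
To illustrate the cascade, here is a small sketch in Python with SQLite. It is not SpamFilter's actual schema: QuarID, Recipient, Body and the sample addresses are invented, and SQLite's REFERENCES ... ON DELETE CASCADE also checks inserts, unlike the setup described above. It only shows the delete direction: one stored message, three recipient "headers", and a single delete on tblMsgs that removes all of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless it is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE tblMsgs (MsgID INTEGER PRIMARY KEY, Body TEXT);
CREATE TABLE tblQuarantine (
    QuarID INTEGER PRIMARY KEY,
    MsgID INTEGER REFERENCES tblMsgs(MsgID) ON DELETE CASCADE,
    Recipient TEXT
);
-- One email body, stored once, quarantined for three recipients.
INSERT INTO tblMsgs VALUES (1, 'spam body stored once');
INSERT INTO tblQuarantine (MsgID, Recipient) VALUES
    (1, 'a@example.com'), (1, 'b@example.com'), (1, 'c@example.com');
""")

before = conn.execute("SELECT COUNT(*) FROM tblQuarantine").fetchone()[0]
# Deleting the message makes the database remove every header pointing to it.
conn.execute("DELETE FROM tblMsgs WHERE MsgID = 1")
after = conn.execute("SELECT COUNT(*) FROM tblQuarantine").fetchone()[0]
print(before, after)  # 3 0
```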

In reality, however, to optimize the routine cleanup process we do not rely on the database's foreign key constraint, as that is actually a cause of slowdowns: for each delete on tblMsgs, the database has to look up and individually delete all related records from tblQuarantine. We find it more efficient to first bulk-delete all old records from tblQuarantine. There are no cascading deletes here, so the process is very fast. Once this is done, we delete all orphaned records in the tblMsgs table. As there are now no more "linked" records (we just deleted them all before), this process is very fast as well.

We left the foreign key cascade delete constraint as it's always better to have a good cleanup in place in case the entries in the tblMsgs table are deleted by an external process and our routine scheduled cleanup has been disabled...

How long do you estimate that query (DELETE tblMsgs FROM tblMsgs) would take to execute on say 750k rows? And how often is spamfilter running that query?

From the looks of it, out of the box the tblMsgs has an index on msgid as it is the primary key for the table.

The msgid field in tblquarantine does not have an index (or is mine missing it?).

I made a copy of my spamfilter database and tried to run the command and it took so long I just gave up at about 45 minutes.

What is the INI entry to disable the housekeeping?

I'm thinking of putting an index on tblquarantine.msgid. The resulting join is faster but still quite slow on my server (about 3 minutes). Something like the following is much faster, but does it create some other issue I might be overlooking?

The time would depend on the database server's speed and the type of database (SQL Server 2005 and higher, for example, are much faster than MySQL). Guesstimating, I'd say anywhere between 10 and 60 minutes. SpamFilter runs that query every 60 minutes by default, but that can of course be customized and/or disabled (from the "Database Setup" tab).

Are you 100% certain that the MsgID field in the tblQuarantine does not have an index? There should indeed be one, along with several others for other fields as well.

In regards to your suggested query, I'd recommend against it. Deleting entries from the tblMsgs causes the database's built-in triggers we added as a safety-net to kick in to perform cascade deletes from the related records in the tblQuarantine, which should make the query run slowly. In addition, there isn't a parameter in it to specify how old the messages have to be before being deleted.

Thanks for the detailed feedback Roberto, we do have an index on that field. I am just surprised how long it takes even with an index.

On my server with 512,000 records in tblquarantine and 360,000 records in tblmsgs the process takes between 2 - 4 minutes with an index. With no index, you can forget about it. I tried it, but aborted after 24 hours.
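
For anyone unsure whether their copy has the index, the check can be scripted. The sketch below uses Python with SQLite only because it is easy to run anywhere; SQL Server and MySQL list indexes through their own catalog tools, so treat this as an illustration of the idea. The helper name has_index_on and the index name idxQuarantineMsgID are made up.

```python
import sqlite3

def has_index_on(conn, table, column):
    """Return True if any index on `table` has `column` as its leading column."""
    for row in conn.execute(f"PRAGMA index_list({table})"):
        index_name = row[1]  # columns: seq, name, unique, origin, partial
        cols = conn.execute(f"PRAGMA index_info({index_name})").fetchall()
        if cols and cols[0][2] == column:  # columns: seqno, cid, name
            return True
    return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblQuarantine ("
    "QuarID INTEGER PRIMARY KEY, MsgID INTEGER, Expire INTEGER)"
)

before = has_index_on(conn, "tblQuarantine", "MsgID")
# Add the index the orphan-cleanup join depends on, if it is missing.
conn.execute("CREATE INDEX IF NOT EXISTS idxQuarantineMsgID ON tblQuarantine (MsgID)")
after = has_index_on(conn, "tblQuarantine", "MsgID")
print(before, after)  # False True
```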

Nothing further can be placed in quarantine while this housekeeping is running, which causes an immediate backlog for servers that try to quarantine items during that time.

If you have multiple spamfilter servers running, I don't see any need for both (you have a primary and secondary right?) of the servers to be executing the cleanup every hour.

We have made some changes to our system and can run two different routes to do our housekeeping. Our 'regular' housekeeping now takes less than 4 seconds. A few times a day we can do a 'deep' housekeeping and it takes about 30 - 40 seconds.

We will try it like this for a while and see how it goes. Our spamfilter did not actually have a problem; we were just trying to make it faster, as we did notice the servers getting backlogged from time to time. The backlog lasted a couple of minutes each time, so I suspect it was the housekeeping locking the tblmsgs table.
