Most of my plans for additional scripts are stalled while I contemplate a solution to the question I posted (in another thread) regarding how to determine into which mailbox a filter has placed a message when the ultimate destination is not the original message destination.

My only current workaround is to give each mailbox its own hard-coded version of the script. Workable, but not my first choice.

Let's talk about the MoveIPAdrToStartOfFile script.
Currently, if the message meets certain criteria, I automatically add the IP to the SpamAddress.txt file via the AddIPAdrToAdrFile script.

What I want to do (I think) is instead to run the MoveIPAdrToStartOfFile script to ensure I have no duplicates.

Or, run both scripts?

BTW, nice work on these scripts. I think they will ultimately make a big difference. Since I can't ensure that Barca uses the banned domain lists, I may modify your scripts to read the banned domains and filter as above.

You can use MoveIPAdrToStartOfFile instead of AddIPAdrToFile; I do. Move... is equivalent to running Remove... followed by Add...

Some of the scripts were learning experiences in order to be able to write the final scripts I actually wanted. MoveIPAdrToStartOfFile is an example. It combines AddIPAdrToFile and RemoveIPAdrFromFile. Whether or not MoveIPAdrToStartOfFile finds an existing copy of the IP address of the current message in the file, it will add the IP address of the current message to the file as the first entry.
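For anyone who wants to see the Remove-then-Add idea outside of PocoScript, here is a minimal sketch in Python. It is not the actual MoveIPAdrToStartOfFile script. The function name `move_to_front` and the use of an in-memory list are my own assumptions for illustration, and it ignores the blank first line that the real address files use.

```python
def move_to_front(addresses, ip):
    """Return a new list with `ip` as the first entry.

    `addresses` is a list of IP address strings, one per line of the file.
    `move_to_front` is a hypothetical name, not one of the posted scripts.
    """
    # The "Remove..." step: drop every existing copy of the address.
    remaining = [a for a in addresses if a != ip]
    # The "Add..." step: put the address back as the first entry,
    # whether or not a copy was found above.
    return [ip] + remaining
```

Because every existing copy is dropped before the address is re-added, running this on each hit also keeps the list free of duplicates for that address.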

A suggestion if you intend to use InsertAdrLineNumberIntoBody. Call that before MoveIPAdrToStartOfFile, otherwise the result will always be line 1.

I have modified InsertAdrLineNumberIntoBody since I posted it. Since the Insert... script exists purely to pacify me that the SpamAddress filter is, or would be, doing something, I added code so that if the address is not found it says, "IP Address of this message not found in SpamAddress.txt". This way every message has something to show me the script was run. If anyone would like this version posted, let me know.

Again, the only messages I run the Insert... script on are messages that could be immediately deleted instead of being placed in a separate mailbox, so altering or corrupting them (which has still only happened once) isn't a concern. The script provides the following information:

My hit rate for the SpamAddress filter has been quite a bit higher in the past few days: several hits each day. One was at line 503, one at 1613, and a few were in the 200s and 300s. The rest were below 100. However, all of these messages were caught by SpamEntire instead. SpamEntire is a keyword filter. It is far more effective than the Bayesian filter and gives no false positives. It does, however, require regular maintenance, i.e., adding new keywords from messages that make it to the Unknown Sender mailbox.

I just finished adding my latest scripts that use SpamEntire.txt and SpamAddress.txt to another system that has multiple Pocomail users. Both this system and the other are still using Pocomail3 version 3.4.0.2130 under WinXP Pro.

The process of adding the filters and scripts is not very straightforward under WinXP Pro. Each user has to have their own copy of the .txt files and scripts, since I have so far not found a way to have a single file all users can modify [in the Pocomail subdirectories].

Also, instead of "..\SpamAddress.txt" the path is "..\users\<username>\SpamAddress.txt" or "c:\program files\pocomail3\users\<username>\SpamAddress.txt".

The script filename must also reflect the altered path. This requires each user to have a unique version of each script. A maintenance complication.

Hopefully these types of things will be addressed in a future version of Pocomail, if they haven't already been in version 4.

As I was nodding off last night I decided I needed to add more info to InsertAdrLineNumberIntoBody before going to bed.

The two results are now:

Found IP Address 1 time(s)
Line number 1083
SpamAddress.txt has 3177 addresses.
-----------------------

and

IP Address of this message not found in SpamAddress.txt
SpamAddress.txt has 3177 addresses.
-----------------------
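To make the two outputs concrete, here is a rough Python sketch of how such a report could be assembled. It is not the InsertAdrLineNumberIntoBody script itself. The name `describe_hit` is hypothetical, and I am assuming the first address counts as line 1 (the real file keeps a blank line 0 ahead of it) and that the first match is the line reported.

```python
SEPARATOR = "-" * 23  # divider line; the exact length is a guess


def describe_hit(addresses, ip, filename="SpamAddress.txt"):
    """Build the report lines for one message.

    `addresses` holds the IP addresses in file order, without the blank
    first line. `describe_hit` is a hypothetical name for illustration.
    """
    # 1-based positions of every copy of the address in the list.
    positions = [n for n, a in enumerate(addresses, start=1) if a == ip]
    total = f"{filename} has {len(addresses)} addresses."
    if not positions:
        return [f"IP Address of this message not found in {filename}",
                total, SEPARATOR]
    return [f"Found IP Address {len(positions)} time(s)",
            f"Line number {positions[0]}",
            total, SEPARATOR]
```

A count above 1 in the first line would point to duplicates in the file, which is one reason to show it.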

I suppose with all the changes I should now rename the script to InsertStuffIntoBody.

I also notice that sometimes not all lines of an HTML message are displayed after the insertion has been made (use the DisplayRawMessage script to see this). In the most recent file only every other line is displayed. Selecting the content of the message makes the missing lines appear, but interestingly they are not selected (highlighted).

Again, any file on my system that has had the Insert... script run on it will be deleted anyway so I may not make any effort to fix the missing line problem, other than to troubleshoot in case I use this technique for something in the future that does matter.

I had an odd occurrence today. A message was in the SpamAdr mailbox but said this:

"IP Address of this message not found in SpamAddress.txt"
"SpamAddress.txt has 3333 addresses."

Eventually I figured it out. The file within a filter checks all IP addresses within the message. The scripts I have written, which are called by that filter, only check the most recent IP address. In the case I mentioned above, it was the next most recent IP address that was in my SpamAddress.txt file.
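The difference between the two checks can be sketched in Python as below. This is only an illustration, not how Pocomail does it internally. I am assuming the addresses come from the Received headers, that the topmost Received header is the most recent one, and that a simple IPv4 pattern is good enough. All function names are hypothetical.

```python
import re
from email import message_from_string

# Loose IPv4 pattern; it does not validate that each part is 0-255.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def received_ips(raw_message):
    """Return every IPv4 address found in the Received headers, newest first."""
    msg = message_from_string(raw_message)
    ips = []
    for header in msg.get_all("Received", []):
        ips.extend(IPV4.findall(header))
    return ips


def matches_any(raw_message, blocked):
    """Filter-style check: a hit if any address in the message is listed."""
    return any(ip in blocked for ip in received_ips(raw_message))


def matches_most_recent(raw_message, blocked):
    """Script-style check: only the most recent address is looked up."""
    ips = received_ips(raw_message)
    return bool(ips) and ips[0] in blocked
```

A message relayed through a listed address, but delivered last by an unlisted one, passes the first check and fails the second, which is the mismatch described above.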

Also, I am thinking about writing a script that will merge two address files without creating duplicates. The reason for wanting to combine files is that two systems, or even two accounts on the same system, will accumulate different addresses. I would like a way to combine the lists. As a byproduct, for those who created their initial lists manually as I did, the script will also eliminate duplicates from a single file by merging it with an empty file.

Here is the script that will merge two IP address files. The script has hard coded file names, File1.txt and File2.txt. I tested it several times under different circumstances, but offer no guarantees for your use. I suggest you keep a copy of your original files using different names. File2.txt will always be overwritten. File1.txt is left as is.

Also note that this script can take a while to complete. It took over 30 seconds to process a file with 3450 addresses. There were about 50 duplicates. It is not intended to be run frequently or automatically so the processing speed shouldn't present a problem.

The basic script is pretty straightforward. Working around the search bug (see the description in the script) and displaying the file sizes at the end make up nearly half the script.

{ MergeAdrFiles - Version 1.00
{ Author: Scott Taylor - February 24, 2006
{
{ Purpose: Merge two spam IP address files into a single file. Can also be used to eliminate
{ duplicates from a single file by starting with File2.txt as either empty or nonexistent.
{
{ Method: The script copies the IP addresses from File1.txt to File2.txt adding only addresses
{ not already contained in File2.txt. When finished the script reports the number of lines in
{ File1.txt and File2.txt at the start and File2.txt when complete. File1.txt remains unaltered.
{
{ Notes:
{ 1) File within a filter and LocateLine seem to have trouble with content on line 0.
{ As a result this script expects line 0, the first line, to be blank. Handling this apparent
{ bug complicates the script slightly around the InsertBlank section.
{ 2) My 2.5GHz Pentium 4 took over 30 seconds starting with File1.txt having 3450 addresses with
{ File2.txt empty or nonexistent. In other words, this script can take a while. Be patient.
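For comparison, the same merge idea can be sketched in a few lines of Python. This is not a translation of MergeAdrFiles. It uses a set to track addresses already seen, which is my own substitution for the line-by-line search, and it also removes duplicates that were already inside File2.txt. The name `merge_addresses` is hypothetical, and blank lines (including the blank line 0 the real files need) are dropped, so a caller would have to put that blank first line back before saving.

```python
def merge_addresses(file1_lines, file2_lines):
    """Merge two lists of address lines without creating duplicates.

    File2's addresses come first, followed by any File1 addresses that
    File2 did not already contain. Neither input list is modified.
    """
    merged = []
    seen = set()  # addresses already copied to the result
    for line in list(file2_lines) + list(file1_lines):
        ip = line.strip()
        if ip and ip not in seen:
            seen.add(ip)
            merged.append(ip)
    return merged
```

Merging a file with an empty list gives the single-file cleanup mentioned above, for example `merge_addresses(lines, [])`.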

For anyone interested, here is my statistics file after a little over 5 days.

One comment about Spam address hit line numbers. Since new addresses are inserted at line one, the same line number showing up several times doesn't mean it is the same address. It probably isn't, since the location moves each time another spam message comes in.

Also, each time there is a hit, that IP address is moved to the front of the file. The next time that address is used, a low line number will be reported.

3532 Addresses in SpamAddress.txt
Most Recent Spam Address Hit line Numbers
Line 38
Line 38
Line 28
Line 674
Line 218
Line 395
Line 38
Line 22
Line 1944
Line 1328
Line 32
Line 1
Line 73
Line 30
Line 160
Line 15
Line 30
Line 109
Line 1183
Line 534
Line 241
Line -1
Line 164
Line 213
Line 203
Line 5
Line 134
Line 2
Line 7
Line 1
Line 21
Line 116
Line 169
Line 358
Line 6
Line 6
Line 5
Line 101

For those interested in how well IP address filtering is working, my latest statistics file is below.

IP address filtering would catch about 15% of my incoming spam. About 2/3 of those get caught by other filters first. That is not a reflection of the failure of IP address filtering; it means the other filters are working well also. The Spam Entire, or keyword, filter requires regular maintenance, whereas the IP address filter can be left to itself, except that addresses from spam e-mails that make it to the main mailbox must be added manually.

On the family system I modified the MoveIPAdrToStartOfFile script to move the message from the main mailbox to the trash mailbox. It only took two lines. I then assigned the script to a button. This allows my wife and children to add the address and "delete" the message with a single click.

Moving the message to the trash mailbox gives them a second chance in case they accidentally include a non-spam message. I assigned the RemoveIPAdrFromFile script to a second button.

I noticed a sharp increase in the Spam Address hit rate when the number of addresses exceeded around 4000. There could be two reasons for that. Either the spammers are mainly using about 4000 addresses, or the time it takes them to start repeating happened to be how long it took me to accumulate 4000 addresses. Either way, the hit rate increased from one every day or two to over 10 per day.

Notice also how many hits there are with line numbers above 4000 in the list below. Remember, a hit moves the address to line 1. This means each one of those high line number hits is a unique address.

4923 Addresses in SpamAddress.txt
Most Recent Spam Address Hit line Numbers
Line 1
Line 13
Line 64
Line 4258
Line 107
Line 67
Line -1
Line 105
Line 4227
Line 119
Line 1
Line 4233
Line 50
Line 9
Line 4143
Line 6
Line 2928
Line 10
Line 4387
Line 38
Line 4363
Line 176
Line 33
Line 71
Line 6
Line 1026
Line 21
Line 30
Line 4129
Line 51
Line 4522
Line 24
Line 4533
Line 434
Line 171
Line 13
Line 97
Line 80
Line 68
Line 6
Line 4454
Line 4123
Line 4022
Line 52
Line 72
Line 493
Line 14
Line 389
Line 3941
Line 4335
Line 584
Line 44
Line 4278
Line 25
Line 4058
Line 4087
Line 3896
Line 6
Line 29
Line 4055
Line 4092
Line 4

When I saw that my keyword hit rate is over 50% of all incoming messages and the IP address filter catches 6% and would catch another 9% if they weren't caught by another filter first, I added these filters to the family system my wife and children use.

After these filters had been running for a week or two, the implications of three users with little in common in terms of e-mail started becoming apparent.

If I allowed each user to add addresses to a common SpamAddress.txt file, one user could block e-mails that another user might want.

My solution to this was that they all use my keyword list and IP addresses. These files reside in the "c:\program files\pocomail3" directory. Any messages caught by the keyword filter would have their IP address added to the common list. Since I carefully consider before adding any words this is a safe method. The words in my list are generally obscene, sexually explicit or similarly objectionable. I see no reason my wife and children should see such messages.

Each user has a button that adds the IP address of the current message to their own copy of SpamAddress.txt in their own directory c:\program files\pocomail3\users\<user name>. The same script then moves the message to the "Trash" folder. A second button removes the IP address from their individual file in case they mistakenly add someone they want to get e-mail from.

In this way my list decides what gets stopped for all of them while their own list decides what they personally don't want without affecting the other users. Two different files within a filter are used.

For the local file:
%file%:"..\SpamAddress.txt"
For the common file:
%file%:"c:\program files\pocomail3\SpamAddress.txt"
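The two-list arrangement amounts to something like the Python sketch below. It only illustrates the lookup order; the real work is done by the two file-within-a-filter lines above. The function names are hypothetical, and building the per-user path from the user name is my own assumption about how the layout could be expressed.

```python
from pathlib import PureWindowsPath

POCOMAIL_DIR = PureWindowsPath(r"c:\program files\pocomail3")


def common_file():
    """Path of the shared list that applies to every user."""
    return str(POCOMAIL_DIR / "SpamAddress.txt")


def local_file(username):
    """Path of one user's private list (hypothetical helper)."""
    return str(POCOMAIL_DIR / "users" / username / "SpamAddress.txt")


def which_list(ip, common, local):
    """Return which list stops the address: 'common', 'local', or None.

    `common` and `local` are sets of IP address strings already loaded
    from the two files.
    """
    if ip in common:
        return "common"
    if ip in local:
        return "local"
    return None
```

Since each user's filter only ever sees that user's own local set, an address one user adds never affects the others.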

The scripts for the common versus individual cases must also have the unique file paths and names. I can post these if anyone is interested. They are essentially MoveIPAdrToStartOfFile with minor changes. I think I called them AddAdrToCommonFile and AddAdrToLocalFile.

There are now 25,000 addresses in my list. There is no noticeable slowing of incoming e-mail (2.5GHz Pentium).

About 10% of the incoming spam would get caught by the spam address filter but about 1/3 of them get caught by other filters first. I determine this by running the script twice. The first time the script checks but takes no action other than to increment a hit counter. The second time is after all the other filters.

The reason I do it this way is that the content filters delete the e-mail, while the address filter places the message in a special mailbox that I manually scan before deleting each message. Every so often a valid e-mail ends up there.

As can be seen from this short list of recent hit address positions, the age of the address varies over the full range. Hit addresses are moved to the front, so in theory the last addresses are the least recently used. Originally I was going to delete them, but since checking doesn't seem to slow things down I never have. When it does, I will.
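Here is the two-pass counting idea as a small Python sketch. It is a stand-in for the filter order, not the actual scripts. The name `count_hits`, the message dictionaries, and the filter functions passed in are all hypothetical.

```python
def count_hits(messages, blocked, other_filters):
    """Count address hits before and after the other filters run.

    `messages` is a list of dicts with at least an "ip" key.
    `other_filters` is a list of functions that return True when a
    message is caught (and so never reaches the address filter).
    Returns (would_catch, actually_caught).
    """
    would_catch = 0      # first pass: check only, no action taken
    actually_caught = 0  # second pass: runs after all other filters
    for msg in messages:
        hit = msg["ip"] in blocked
        if hit:
            would_catch += 1
        if any(f(msg) for f in other_filters):
            continue  # caught by another filter first
        if hit:
            actually_caught += 1
    return would_catch, actually_caught
```

The gap between the two counters is the share of address hits that the other filters got to first.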

Line 7594
Line 1670
Line 329
Line 2438
Line 2607
Line 2452
Line 326
Line 5513
Line 10812
Line 726
Line 3173
Line 2042
Line 2816
Line 4300
Line 44
Line 106
Line 3830
Line 1113
Line 19807
Line 19805
Line 76
Line 10623
Line 15844
Line 3746
Line 2315
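If the list ever does need trimming, the move-to-front order makes it simple: keep the first N entries and drop the rest. A minimal Python sketch follows, with a hypothetical function name and no claim that this is how a PocoScript version would look.

```python
def prune_least_recent(addresses, keep):
    """Keep only the first `keep` addresses.

    Because hits are moved to the front, the entries dropped from the
    end are the ones that have gone longest without a hit.
    """
    if keep < 0:
        raise ValueError("keep must not be negative")
    return addresses[:keep]
```

Hits as deep as line 19807 in the list above suggest that any cutoff would lose some addresses that are still in use.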

There are now 36,000 addresses in my list. I do notice some slowing but not much.

Address filtering catches about 5% of the spam. Overall probably not worth the effort compared to the key word/phrase filter that catches about 36%. I suspect that would be higher if I spent more time adding key words.

The ones that get through are generally the ones that are gif attachments containing text images.