The attached program and its Visual Studio 2010 source code are by Richard Twyning in the UK.

It does mass replacement of strings of text with other strings of text in one or more text or HTML files, or any other file that's plain text.

The Book1.txt file is a sample of how the replacementfile.txt is organized.

However, that file is too long for the program. On exit, the program re-sorts replacementfile.txt by the character values of the odd-numbered lines, which are the strings to be replaced in the source file(s). If there are too many lines, it truncates the file.
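For anyone who wants to experiment with the file layout outside the program, here is a rough Python sketch of it: odd-numbered lines are the strings to find, and each following even-numbered line is the replacement. The function names (parse_pairs, sort_pairs, format_pairs) are made up for this example and are not taken from Richard's source. The sort goes by character value like the program's does, but nothing gets dropped no matter how long the list is.

```python
def parse_pairs(text):
    """Split replacement-file text into (find, replace) pairs.

    Odd-numbered lines are the strings to find; the line after each
    one is its replacement (which may be empty).
    """
    lines = text.replace("\r\n", "\n").split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty piece left by a final newline
    if len(lines) % 2:
        raise ValueError("replacement file has an unpaired last line")
    return list(zip(lines[0::2], lines[1::2]))


def sort_pairs(pairs):
    """Sort by the find string's character values, keeping every pair."""
    return sorted(pairs, key=lambda pair: pair[0])


def format_pairs(pairs):
    """Turn the pairs back into replacement-file text."""
    return "".join(f"{find}\n{replace}\n" for find, replace in pairs)


sample = "cat\ndog\napple\npear\n"
print(format_pairs(sort_pairs(parse_pairs(sample))), end="")
```

Run on the four-line sample, this prints the apple/pear pair ahead of the cat/dog pair, with each replacement still sitting directly under its find string.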

Another issue is that it doesn't work properly with Windows 1252 (or Extended ASCII) characters from 128 on up. It replaces them in replacementfile.txt with some non-printable character, but only when replacementfile.txt is over some unknown length. I tried removing all the codes for the ten digits and the regular uppercase and lowercase letters. After that it no longer truncated the file, but it still fouled up the characters above 128.
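I don't know what causes the character damage in the program itself, so the following is not a fix for it. It is only a Python sketch of reading and writing a replacement file with the encoding pinned to Windows 1252 (Python calls the codec "cp1252"), so that characters above 128 come back as the same single bytes they went in as. The helper names and the temporary file are assumptions made for the example.

```python
import os
import tempfile

CODEC = "cp1252"  # Python's name for the Windows 1252 code page


def read_text(path, encoding=CODEC):
    """Read a file as text, decoding bytes with an explicit code page."""
    with open(path, "r", encoding=encoding, newline="") as handle:
        return handle.read()


def write_text(path, text, encoding=CODEC):
    """Write text back out with the same explicit code page."""
    with open(path, "w", encoding=encoding, newline="") as handle:
        handle.write(text)


# An e-acute and a pair of curly quotes, all above 128 in Windows 1252.
original = "caf\u00e9\n\u201cquoted\u201d\n"
path = os.path.join(tempfile.mkdtemp(), "replacementfile.txt")
write_text(path, original)

with open(path, "rb") as handle:
    raw = handle.read()

print(raw)                          # each extended character is one byte
print(read_text(path) == original)  # round trip leaves the text unchanged
```

Passing the encoding explicitly on both the read and the write is the point of the sketch. If either side falls back to a different default, extended characters are the first thing to break.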

As it is, as long as the replacement file is kept short enough, it works.

If I can get Visual Studio 2010 I'll take a stab at curing those bugs. I posted the program and source here in case anyone else wants to give it a go or use it as an example or part of some other project. Richard based this partly on other open source code.

The goal for my use is to have a replacementfile.txt that has every possible UTF-8 code in it, with and without leading zeros, and all cases where more than one code corresponds to the same Extended ASCII or Windows 1252 character. Then it'll be a much less time-consuming job to convert UTF-8 encoded files for further conversions for any platform that does not support Unicode. (In my case that means Palm OS!)
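A list like that could be generated rather than typed by hand. The Python sketch below rests on a big assumption: that the "codes" are HTML decimal character references such as &#233; and &#0233;. If they are something else, the loop would need different format strings. For the characters in the 128 to 159 range it also emits the reference built from the raw byte value (for example &#147;), which browsers conventionally treat the same as the Unicode one (&#8220;). The function name entity_pairs is made up for the example.

```python
def entity_pairs():
    """Build (reference, character) pairs for Windows 1252 bytes 128-255.

    Assumption: the codes to replace are HTML decimal character
    references, written with and without a leading zero.
    """
    pairs = []
    for byte in range(128, 256):
        try:
            char = bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            continue  # a few byte values are unassigned in Windows 1252
        # The Unicode code point, plus the raw byte value where it differs.
        codes = {ord(char), byte}
        for code in sorted(codes):
            # The set merges the two spellings when padding changes nothing.
            for ref in sorted({f"&#{code};", f"&#{code:04d};"}):
                pairs.append((ref, char))
    return pairs


pairs = entity_pairs()
for ref, char in pairs:
    if char == "\u00e9":
        print(ref)
```

Each pair could then be written out as two lines, the reference first and the character second, to match the replacement file layout.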

With the replacementfile.txt bugs squashed, this program could be very useful for many other purposes. It does work fine in a limited fashion now, good enough for processing HTML files to run through Mobipocket Creator, which so far is the only Mobi converter I've found that properly groks Windows 1252 encoding using extended characters and will not foul it up in the conversion by assuming the output ought to be Unicode.

One idea I had was using it for simple OTP cipher encryption. Pass a source document through the program several times using a series of replacement files, then send it to another party, who uses another set of replacement files in the opposite order to decode it. As with all OTP systems, getting the decoding code(s) to the destination party securely is the problem.

Another use could be doing replacements in source code files, especially if you need to make huge numbers of changes. Find and replace in most text editors handles one change at a time, applied to every instance of that one string. This program can apply many different changes, each to every instance, in a single pass.
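To show why a single pass matters, here is a small Python sketch of one way to do it, using a regular expression that matches any of the find strings. This is not how Richard's program is written; it only illustrates the idea, and replace_all is a made-up name. Because each stretch of the input is matched once, text that has already been replaced is never replaced again, so even a swap works.

```python
import re


def replace_all(text, pairs):
    """Apply every (find, replace) pair to text in one pass."""
    table = dict(pairs)
    # Try longer find strings first so "cats" wins over "cat".
    keys = sorted((key for key in table if key), key=len, reverse=True)
    if not keys:
        return text
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    return pattern.sub(lambda match: table[match.group(0)], text)


swapped = replace_all("cat chases dog", [("cat", "dog"), ("dog", "cat")])
print(swapped)  # dog chases cat
```

Doing the same two changes one after the other in a text editor would give "cat chases cat", since the second change would also hit the output of the first.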

Anyway, I posted it here so people can make of it what they will and perhaps expand it or find it useful in some way.