select foreign characters (word 2003)

I posted this on the Excel forum (post #379179). The Excel macro works perfectly, and now I want to use it in Word. Hans gave me a lot of help.
Unfortunately, the Word macro cannot be run on a large file.
Can anyone help with this, please?

Re: select foreign characters (word 2003)

Some additional info for those who haven't followed the Excel thread starting at post 379179:

Joe has documents with a mixture of Western and Chinese characters. He wants to change only the Chinese characters (or only the Western characters) in size. I adapted an Excel macro by sdckapr for Word; it loops through the document character by character, and tests whether the AscW value of the character is within or outside the range 0..255. It works, but it is excruciatingly slow. Anyone got a better idea?

Re: select foreign characters (word 2003)

Maybe you could adapt the code in the attached? It's a demo that highlights the "non-ANSI" characters by looping through a byte array of each paragraph. It has a problem with the "smart" apostrophe, which gives an Asc value under 255 but an AscW value greater than 255. I'm not sure what the solution is for that. If I knew more about the Chinese character set, that probably would help a lot.
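A small illustration of why the smart apostrophe misbehaves, in Python as a stand-in for the VBA byte array. It assumes the byte array holds UTF-16 little-endian data (low byte first, high byte second) and that Asc converts through the Windows-1252 code page, which is an assumption about a typical Western setup and not something confirmed in this thread. The helper name `high_bytes` is made up for the example.

```python
def high_bytes(text):
    """Return the high byte of each UTF-16 code unit (the odd-indexed bytes)."""
    data = text.encode("utf-16-le")
    return list(data[1::2])

apostrophe = "\u2019"                      # right single quotation mark
print(apostrophe.encode("cp1252")[0])      # 146: the single-byte (Asc-style) code
print(ord(apostrophe))                     # 8217: the Unicode (AscW-style) code
print(high_bytes("a\u2019\u4e2d"))         # [0, 32, 78]
```

The apostrophe has a non-zero high byte (32) even though it is Western punctuation, so a plain "high byte is zero" test flags it. One possible workaround, untested against real documents, would be to also treat the general punctuation block U+2000 to U+206F as Western.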

Re: select foreign characters (word 2003)

Andrew's For Each is indeed spectacularly faster than For x = 1 To ..., and Jefferson's solution is even faster than that. In a test I did on a 15-page document (before running the macro), the For x code did not finish within a reasonable time; Andrew's took only 4.5 seconds, and Jefferson's just 0.7 seconds. The result seemed correct.

Re: select foreign characters (word 2003)

Hi Jefferson,

Although I don't think this can be done in VBA or even VB (so what's the point, you ask), we had another approach that we used when doing some development in IBM's PL/1 (is that still used?). The approach would go something like this:
- Starting with your approach of examining para by para, we would NOT examine each char initially. This first step has a big payoff with big paras or a lot of paras of western-only characters. It would go as follows:
---"overlay" a Char string over the arrBytes (so it becomes another way of looking at the storage of arrBytes but occupies no extra storage)
---define a "mask" string of bytes alternating as X'FF' and X'00' with X'FF' being the odd numbered bytes (like how you do your testing of odd bytes)
---AND the mask string with the original Char string and store as Result string
---compare the Result string to a string of all bytes set to 0

For big paragraphs, this has a bigger payoff than for small paragraphs.
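The pre-screening step above can be sketched roughly as follows. This is Python, not PL/1 or VBA, and it imitates the whole-string AND by treating the buffer as one large integer; the function name `is_all_western` and the UTF-16 little-endian layout (low byte first) are assumptions made for the example.

```python
def is_all_western(text):
    """Pre-screen a paragraph with one whole-buffer AND.

    The mask has 0x00 over each low byte and 0xFF over each high byte,
    so the AND keeps only the high bytes. A zero result means every
    character code is in 0..255.
    """
    data = text.encode("utf-16-le")            # low byte first, then high byte
    mask = b"\x00\xff" * (len(data) // 2)
    # Treat both buffers as one big integer so the AND is a single operation.
    result = int.from_bytes(data, "little") & int.from_bytes(mask, "little")
    return result == 0

print(is_all_western("plain western text"))   # True
print(is_all_western("mixed 中文 text"))       # False
```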

When doing the AND, any bits set to 1 in the original string in the odd bytes are preserved but any bits set to 1 in even bytes are forced to 0. Now for setting whatever chars you want to make bigger, you need to look at the result of the comparison and what you're trying to flag:
- If the comparison to the all 0's string is equal, then the original was all western characters.
---If you want to set western characters to be bigger, then you can do this for the entire para at one shot
---if you want to set non-western characters to be bigger, then skip this para since there are none
- If the comparison to the all 0's string is not equal, then there were some (or all) non-western characters in the original
---what you do here depends on what you want to make bigger (western or non-western). For example, if you want to make western chars bigger, you could use InStr searching for the null X'00' string (assume no even bytes have this for any characters)

In general, we found these string operations to be much faster when starting with "big" strings.

But I don't think this can be done with VB or VBA so it's somewhat academic here.

But it was a useful exercise for us since CPU time was a scarce resource.

Re: select foreign characters (word 2003)

Fred, I agree with the idea of "pre-screening" ranges of text, and in the past when we had threads about finding, for example, some kind of errant formatting, I think we had code that did that. I'm not sure it's feasible in this case, but I haven't really given it a lot of thought.

> CPU time was a scarce resource

Not to mention the time waiting for the results back from school district headquarters while they ran your punch cards. (Back in my Fortran days.)

Re: select foreign characters (word 2003)

Jefferson,

I was looking in my VB/VBA ref book and the closest I got to being able to do this was the Filter BIF but that's only in VB, not VBA. Not sure it's doable in VBA. But if filter (or the equivalent of PL/1 AND) is available, then I think it's doable. I did note that VBA allows 2 numbers to be ANDed or ORed together, the result being the bit-wise AND or OR. So 6 AND 2 results in 2. But to make it superfast, you want to do an AND over the entire array at once and not iterate thru a do-loop.

Of course, Fortran had no native string manipulation functions at all so you're "forgiven". My very first professional assignment was to do a big computer system for tracking problems in telephone central office equipment based on collected data. Everyone wanted to do it in Fortran because that's what they had used on the previous project. I said let's do PL/1 and we did. But we used punch cards too. (They got back at me because the next project was done in Fortran, but we used a lot of text-manipulation libraries.)