Some rough benchmarks on a 1500 character (and 2900 byte) string (Linux, whatever processor is inside this Thinkpad T43 here, your mileage may vary, etc etc etc) give me about 61 scans/sec with PHP 5.2.1, where a “scan” is just moving through the loop above with mb_substr and doing one if() test comparing the char to ‘<’

Under PHP 6.0.0-dev with unicode.semantics=on, switching from mb_strlen() and mb_substr() to regular strlen() and substr() produces about the same result. And indexing with $theString[$k] is the same speed as substr().