> Without details this cannot be investigated further. All testing I
> have done shows it giving
> character distance for utf8 just fine.

We started using your module with a lot of UTF-8 data from a database and
found some problems.
We have added some test cases that demonstrate the problems and patched the
package so that it works in our environment.
Thanks for your work; it gives us a 20-fold speedup over the function we
used originally.


After a second look at the source, I found some optimisation points that
make the performance of the UTF-8 version comparable to the original version.
I attach a second patch against the original code (version 0.03). I did some
performance analysis with the performance.pl script, which is included in the patch.

> I attach a second patch for the original code (version 0.03). I do some
> performance analysis with the performance.pl script, thats in the patch.

Patched version fails e.g. for:
distance(pack('U2',252,65),pack('U2',252,66));
... returns 0, but ought to return 1 due to the final substitution
'A'->'B' (chr(65)->chr(66)). My guess is a bug in the length checks.
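
For comparison, here is a minimal pure-Perl sketch of a character-level Levenshtein distance that can serve as a reference for the failing case above. The helper name reference_distance is hypothetical and not part of the module or of either patch; it only assumes the core List::Util module.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical reference implementation (not the module's XS code):
# classic two-row dynamic programming over characters, not bytes.
sub reference_distance {
    my ($s, $t) = @_;
    my @s = split //, $s;            # one element per character
    my @t = split //, $t;
    my @prev = (0 .. scalar @t);     # distances from the empty prefix of $s
    for my $i (1 .. @s) {
        my @cur = ($i);              # distance to the empty prefix of $t
        for my $j (1 .. @t) {
            my $cost = $s[$i - 1] eq $t[$j - 1] ? 0 : 1;
            $cur[$j] = min(
                $prev[$j] + 1,           # deletion
                $cur[$j - 1] + 1,        # insertion
                $prev[$j - 1] + $cost,   # substitution or match
            );
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# The failing input from this report: only the last character differs.
print reference_distance(pack('U2', 252, 65), pack('U2', 252, 66)), "\n";  # prints 1
```

A regression test could compare distance() against such a reference on strings that mix multi-byte and single-byte characters.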

Was there a reason the module was deleted from CPAN? Perhaps Joshua Goldberg
would consider allowing a co-maintainer? I'm happy to take responsibility
for the module if he doesn't want to maintain it.
On 9 January 2016 at 07:09, Slaven_Rezic via RT
<bug-Text-LevenshteinXS@rt.cpan.org> wrote: