David,
> I have an upgrade script for the forthcoming version of Bricolage. It
> has a function that looks like this:
So your purpose is to eliminate all non-[0-9A-Za-z_] characters?
> Now, this is all well and good, except that it takes _forever_ to run!
> I have it running now on a database with two tables to be updated, with
> a total of 2213 rows, and it has been running for about 30 mins. And
> yes, it's a fast server with a reasonably tuned database.
That's rather odd ... I've done search-and-replace using Perl on 30,000 large
text fields in a couple minutes. Does this field have foriegn keys and/or
several indexes on it?
> Now, I could speed this up if it could just run in a single query,
> rather than with a big select and 2213 UPDATEs. The trouble is, I don't
> know of an SQL equivalent to y/a-z0-9/_/cs, which translates any non
> alphanumeric characters to underscores. I see that there's a
> substring() function that supports POSIX regular expressions, but that
> doesn't do a substitution.
Well, the new substring can do substitutions, but not neatly in the s//g
format the way Perl can. What about PL/perl?
> So my question is, does anyone know of a way I could do this in the
> database (and it is a one-time change [I hope!])? And if there is one,
> will it be faster?
Not necessarily. I think your execution time problem is not releated to the
queries themselves, but to some major database dependency.
Wanna show me, since I'll soon be getting paid to tune Bric anyway?
--
Josh Berkus
Aglio Database Solutions
San Francisco