Hi!
Jerry Schwartz wrote:
> I have a pretty simple query that seems to take a lot longer than it ought to
> (over 2 minutes).
>
> [[...]]
>
> SELECT
> feed_new.new_title AS `New Title FROM Feed`,
> prod.prod_pub_prod_id AS `Lib Code FROM DB`,
> prod.prod_title AS `Title FROM DB`,
> prod.prod_num AS `Prod Num`,
> prod.prod_published AS `Published FROM DB`
> FROM feed_new JOIN prod
> ON LEFT(feed_new.new_title, LENGTH(feed_new.new_title) - 5) =
> LEFT(prod.prod_title, LENGTH(prod.prod_title) - 5)
> WHERE prod.pub_id = @PUBID AND prod.prod_discont = 0
> ORDER BY feed_new.new_title;
>
> [[...]]
>
> The query is doing a scan of the 9816 records that have pub_id = @PUBID, but
> even so this seems like a long time. Are the built-in string functions really
> that slow?
The general rule is:
If you are not taking the value of a column directly but are applying
any function to it (like "LEFT" in your statement), an index cannot be
used. So the table needs to be accessed (scanned) and the function
computed on each row, to evaluate the predicate (the comparison).
I am no expert in checking "explain" output, so I may well be wrong in
my guess:
I think the execution will scan the whole "feed_new" table (895 records)
for each of those 9816 matches of pub_id, so it is doing 8785320 calls
of "LEFT()" followed by a string comparison.
>
> I suspect it would be faster if I built separate tables that had just the
> shortened versions of the titles, but I wouldn't think that would be
> necessary.
So IMO you have two choices:
- Either you accept the performance implications of a table scan
(which will of course get worse when your data grows),
- or you introduce another column in your table in which you store the
function result (maintained on INSERT and UPDATE) and create an index
on this column.
An additional table with the shortened columns is no good idea IMO,
because you would need to maintain it in sync with your "real" data.
In your example, it should be sufficient to add the new column to table
"feed", because your execution strategy should start by evaluating
prod.pub_id = @PUBID
HTH,
Jörg
--
Joerg Bruehe, MySQL Build Team, Joerg.Bruehe@stripped
Sun Microsystems GmbH, Komturstrasse 18a, D-12099 Berlin
Geschaeftsfuehrer: Juergen Kunz
Amtsgericht Muenchen: HRB161028

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.