Thursday, March 22, 2007

A few days ago, after a minor bug fix in our site code, suddenly, the load on the database server dropped about 50%. I was obviously interested in what caused such a major improvement and found out a few interesting things. To demonstrate, this fascinating phenomenon, lets create a database:

Do you see the difference? Probably not. But it just made the query 16 times slower! I spent a long time finding that one (and in our case I was looking at endless lists of zip codes). Let me illustrate.

Right. Type casting. In the second query I unquoted one of the values in the list in the list. Namely the value "160". Now to the point. I was a bit frustrated and I couldn't believe that a cast of an INT to a CHAR would take so long and it doesn't. A simple explain shows what is wrong. We are not using indexes properly.

As you can see in the first case (with a type cast) we look through all the 100K values, where in the second case (with quoting fixed) we only look through 116. That is where the performance goes. But why? I asked some people from MySQL AB (good thing we have a support contract). And the answer I got was that it is difficult to use indexes on CHAR fields after a cast from INT, since you can comapre a CHAR to an INT in more than one way. Wow! A quick check in MySQL confirmed my nightmares.

Whaaa!!! So that explains why MySQL finds it "difficult" to use indexes after a type cast. The question that remains is what MySQL team was smoking when they implemented string comparison in this creative new way.