In our example we were just indexing INT values (4 bytes in size). But I predict that it becomes even slower, when we index VARCHAR fields.

This situation will become even worse when you get pressure to your RAM or when the MyISAM key buffer will become too small and/or if you have pressure to your I/O system.

When you go to the limits of your hardware you should find a good balance between as many indexes as necessary but as little indexes as possible.

When you want to get rid of indexes the quick wins are indexes which are completely redundant. This happens when the people who create the indexes are not looking if such an index already exists.

An other quick win are indexes which are partially redundant (from left to right). With such indexes it does often not make sense to keep both of them (except you make use of covering indexes and explicitly specified them for this purpose).

It is a bit more tricky to find indexes which are never used. How this works out we will see in a later post.

Comments

Using MyISAM is just asking for poor performance. Try using a storage engine that is actually good at indexing.
For example, InnoDB offers a substantial improvement (perhaps a factor of 4) over MyISAM for maintaining indexes. InnoDB uses an change buffer to achieve that performance.
For really high DML operations, try TokuDB (tokutek.com), which is about ten times faster than InnoDB. TokuDB uses fractal tree indexes to achieve its performance.
Both InnoDB and TokuDB are transactional and can recover from crashes. MyISAM will corrupt your data on a crash.

Moin Bradley,
the post is about the increasing cost of maintaining (more) indexes.
Secondary indexes with InnoDB are even more expensive. Think about a UUID-PK and every secondary saved separately to the PK. So I doubt there is no increasing cost using InnoDB. (And only this would counter the essence of the blog.)
I don't know TokuDB, but as TokuDB is *not* GPL who cares?

You are correct that indexes cost something in InnoDB. However, if you measure InnoDB you will likely find it is much faster than MyISAM for insertions when maintaining an index on a large table. My measurements (and others, such as Mark Callaghan at Facebook) say that InnoDB is faster (maybe 4x) for this kind of load.
Also you are correct that some people will not consider using a non-GPL storage engine, and that's fine. That's part of why I am pointing out that InnoDB is far better suited than MyISAM for the load you measured.
Some people are willing to use a non-GPL storage engine if it offers some other advantage, however. In the case of TokuDB, one advantage is more than a 10x performance improvement for this kind of load. (There are other advantages such as compression.)
I read your home page, which says you offer "vendor neutral" consulting, and that you are in the business of recommending the best solution to meet your clients needs. I hope that there are some situations where TokuDB would be the best solution to meet your clients needs, and if you find such a client, I hope we can work together.
-Bradley

Let me summarize something.
You don't disagree the cost of maintaining indexes as this blog describes.
You are not disagreeing there are "some" costs for innodb either.
So there is a flame MyIsam vs. InnoDb left. Which is imho not intended (at least from the author).
Btw: Im not working for fromdual, so I don't give a ***** what they do. If you want to read *my* homepage just click on my name;)

I didn't mean to be flaming, I'm sorry if it seemed that way. And I apologize for not noticing that the response to my comment wasn't from the original author.
I guess I got a different message from the original posting. The original posting observed that indexes cost something. But what's the point of that observation? The point is to do something to your database to fix the slowness. The original posting implied that dropping some indexes would be one fix to the problem. The chart says fewer indexes means higher performance. So far, so good.
However, there are other things you could do. For example, switch to InnoDB or TokuDB. (There may be other approaches such as using some NoSQL solution, but let's put those aside for now.)
I guess I misread the original posting. Upon re-reading, it seems like the data in the original posting doesn't look so bad. After all, inserting 500K rows in 3 minutes instead of 90 seconds to get 5 indexes doesn't seem so bad. Sure, indexes cost *something*, but it's not much on this measurement. (In fact, for such a small database, perhaps MyISAM is faster than InnoDB. But the problems I've seen don't become difficult until the databases get larger.)
Here's a much more interesting experiment (at least more interesting to me): What happens if
a) the tables are bigger than main memory, and
b) the indexes are random (or nearly random) values (things like UUID-PKs).
Check out Mark Callaghan's postings on his MySQL at Facebook page to see the advantage that InnoDB offers. (And check out the graphs on Tokutek.com to see the additional advantages that TokuDB offers. Or don't if you really don't want to know about a non-GPL solution.)
I guess my point is that if you are finding that the cost of maintaining indexes is slowing down your DML, dropping the indexes isn't your only option. You may like the indexes because they speed up queries. Given a fixed hardware budget and a fixed insertion rate, you can keep many more indexes than with MyISAM. I want people to be able to use indexes without worrying so much about the cost of maintaining them.

It seems indeed, that InnoDB has a less linear overhead than MyISAM (at least in my test) as long as the data can be kept in Memory. An thus it seems to perform slightly better in the described scenario when we have more than 1 to 2 indexes.
Do not get me wrong, I do NOT promote MyISAM here I just want to find the truth.
Some forces (good or bad ones?) hindered me to download Tokutek. I take it as a sign to stay away and stay with clean open source.

Navigation

Trademarks

RSSLinkedInTwitterXingFacebookFlickrGoogle+Follow @fromdual
MySQL, InnoDB and Oracle are registered trademarks of Oracle Corp. MariaDB is a trademark of Monty Program AB.
All other trademarks are property of their respective owners. Other product or company names mentioned may be trademarks or trade names of their respective owner.