kernel_mutex problem cont. Or triple your throughput

First, I may have an explanation for why the performance degrades so significantly and why innodb_sync_spin_loops may fix it. Second, if that is correct (or even if it is not, we can try anyway), then playing with innodb_thread_concurrency may also help. So I ran some benchmarks with innodb_thread_concurrency.

My explanation of the performance degradation is as follows: InnoDB still uses a strange mutex implementation based on sync arrays (hello, 1990s), and I do not have a good explanation for why it has not yet been replaced. Internally, the sync array uses a pthread_cond_wait / pthread_cond_broadcast construction, and on a pthread_cond_broadcast call, all threads competing for the mutex wake up and start racing. This effect is known as the thundering herd.

Davi Arnaut does not agree with me, and I do not agree with him either. This is a healthy discussion, and it is possible only because InnoDB is still open source and we can all check the source code. If the problem were in the closed-source Thread Pool extension, I could not participate in it.

We will probably argue more on that topic, but that does not stop us from trying different innodb_thread_concurrency values (0 by default, that is, no restriction).
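For reference, here is a minimal configuration sketch for such a run; the value 8 is just one of the values tried, not a recommendation, and the variable is dynamic, so it can also be changed at runtime with `SET GLOBAL innodb_thread_concurrency = 8;`.

```ini
# my.cnf fragment (illustrative)
# 0 is the default and means no limit on threads inside InnoDB
[mysqld]
innodb_thread_concurrency = 8
```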


Author

Vadim Tkachenko co-founded Percona in 2006 and serves as its Chief Technology Officer. Vadim leads Percona Labs, which focuses on technology research and performance evaluations of Percona’s and third-party products. Percona Labs designs no-gimmick tests of hardware, filesystems, storage engines, and databases that surpass the standard performance and functionality scenario benchmarks. Vadim’s expertise in LAMP performance and multi-threaded programming helps optimize MySQL and InnoDB internals to take full advantage of modern hardware. Oracle Corporation and its predecessors have incorporated Vadim’s source code patches into the mainstream MySQL and InnoDB products. He also co-authored the book High Performance MySQL: Optimization, Backups, and Replication, 3rd Edition.


Comments (12)

I think the problem is close to being correctly identified. Maybe it is not kernel_mutex that hurts InnoDB; maybe it is the sync_array (protected by its own lock) that hurts. All that stuff: the atomic looping with a dirty read on the variable, the ut_delay with its fake math just to avoid touching the variable, avoiding entering the sync_array mutex as long as possible, pthread conditions and wakeups. At some point, all of this needs to be replaced with a single pthread_mutex_timedlock() (I believe a timed lock is needed to handle deadlocks) on systems that support timed mutex locks, and ported to systems that do not support it.

innodb_thread_concurrency=8 is my favorite way to guarantee that you don’t get more than 8 pending disk operations (ignoring purge, ibuf merges, and readahead). I know you aren’t promoting it as a great solution, because for workloads that want to do a lot of disk IO on busy, capable storage subsystems, it really is a good idea to send more concurrent operations to the disks.

Yeah. innodb_thread_concurrency is especially hard to tune on a mixed workload: some of the time you have a completely CPU-bound load, so you want it relatively low, and some of the time there are heavy IO-bound batch jobs that would benefit from a much higher innodb_thread_concurrency.

The “right” solution would be some form of IO-aware thread scheduling for the whole of MySQL, not just InnoDB, where you can schedule something else to run when a thread is about to block on disk/network IO, locks, etc.

Wlad – I think you are right about using timed mutexes to replace the sync array. With that, the sync array won’t be needed, as each waiting thread can do its own checks for “waiting too long” and “missed wakeup” after each timeout. We have prototyped this a couple of times and the results were usually good, but CPU/mutex-bound workloads are not the common case for me, so I will wait for someone else to implement it for real.

Interesting to see it hitting its optimum at 8, considering that the box has 24 logical threads (12 physical cores). What does this imply? Is it hitting some software bottleneck (sync_array, mutex herding, etc.) or a hardware one (NUMA/cache contention, etc.)? I don’t see hardware becoming a bottleneck, considering it has both RAID and a Fusion-io card. What I/O scheduler was used for this, the default CFQ or deadline? Also, was the filesystem XFS?

Regarding innodb_thread_concurrency being 0, I can see that there would be a lot of thrashing/CPU stealing going on, leading to reduced throughput.

We recently tried some of this tuning to get rid of some contention we are having. The sync_spin_loop changes made no difference, and decreasing innodb_thread_concurrency to 16 or under actually caused our site to crash. So obviously this stuff is workload-dependent.

I think the way we understand innodb_thread_concurrency is wrong. I admit the results presented in this article are nice, but we should understand that innodb_thread_concurrency is neither CPU- nor disk-I/O-bound! As long as threads have not entered the execution pool, they are not working at all, so CPU and disk I/O are not affected. This leads to the conclusion that innodb_thread_concurrency sets up a stand-by pool before a thread gets access to execution. Therefore, threads queued according to innodb_thread_concurrency are not competing for mutexes. So the only settings that affect the execution pool and performance (in terms of CPU and disk I/O) are:

– innodb_read_io_threads
– innodb_write_io_threads
– innodb_commit_concurrency
– innodb_thread_sleep_delay
– innodb_concurrency_tickets
– innodb_sync_spin_loops
– innodb_spin_wait_delay

Having this in mind, and trying to find the most stable configuration for my workload (knowing statistically how many threads try to enter the execution queue simultaneously), I have come to the conclusion that I basically have to tune:

• how many threads get to execution – with regard to the number of CPUs and disks
• how the threads go to execution – by tuning innodb_sync_spin_loops, innodb_spin_wait_delay, and innodb_concurrency_tickets

Based on the MySQL site, I concluded that I need to give threads the chance to wait longer to be granted the execution pool BEFORE entering the sleep state (which puts them in the FIFO pool, but with a performance decrease), by relaxing innodb_sync_spin_loops and innodb_spin_wait_delay. In fact, I slightly increased innodb_sync_spin_loops to 80 (default 30) and reduced innodb_spin_wait_delay to 5 (default 6). I set innodb_thread_concurrency to 32. The result is a major decrease in all mutex competition, and therefore an increase in stability.
I have great respect for Percona’s programmers (they are an inspiration for me), so I’d really appreciate their opinion, maybe conducting a series of tests with this “theory”. Keep up, Percona team, with this great blogging site!

I have to mention that my values for “how many threads get to execution” were set by tuning:

– innodb_read_io_threads = 2 # number of innodb_buffer_pool_instances
– innodb_write_io_threads = 10 # 8 cores + 2 disks
– innodb_commit_concurrency = 2 # number of innodb_buffer_pool_instances

I know that the number “2” may lead to bottlenecks, but in reality the stability is so good: the value of OS Waits is under 1e-4% and “Spin rounds per wait” is below 30 on all types of mutexes, far from the allocated value of 80. All this gave a performance boost. I’d like pros or cons. Thanks!
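Collecting the settings from the comment above into a single my.cnf fragment might look like this (values taken verbatim from the comment; they are tuned for that commenter’s specific workload and hardware, not general recommendations):

```ini
[mysqld]
# stand-by pool size before threads enter execution
innodb_thread_concurrency = 32
# spin longer before the thread sleeps and joins the FIFO queue
innodb_sync_spin_loops    = 80   # default 30
innodb_spin_wait_delay    = 5    # default 6
# "how many threads get to execution"
innodb_read_io_threads    = 2    # number of innodb_buffer_pool_instances
innodb_write_io_threads   = 10   # 8 cores + 2 disks
innodb_commit_concurrency = 2    # number of innodb_buffer_pool_instances
```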