Thursday, August 28, 2014

The InnoDB Mutex, part 3

I repeated tests from part 1 and part 2 using a system that has an older/smaller CPU but more recent versions of glibc & Linux. In this case the CPU has dual sockets with 6 cores per socket and 24 vCPUs with HT enabled. The host uses Fedora, glibc 2.17 and Linux kernel 3.11.9-200. Tests were run up to 512 threads. There was an odd segfault at 1024 threads with the TTASFutexMutex that I chose not to debug. And tests were run for 1, 4 and 12 mutexes rather than 1, 4 and 16 because of the reduced vCPU count. The lock hold duration was ~6000 nsecs rather than ~4000 because of the different CPU.

My conclusions from this platform were less strong than from the previous tests.

The InnoDB syncarray mutex still has lousy behavior at high concurrency but it is less obvious here because I did not run a test for 1024 threads.

pthread default is usually better than pthread adaptive, but in at least one case pthread adaptive was much better

The new custom InnoDB mutex, TTASFutexMutex, was occasionally much worse than the alternatives. From looking at code in 5.7, it looks like the choice to use it is a compile time decision. If only to figure out the performance problems this choice should be a my.cnf option and it isn't clear to me that TTASFutexMutex is ready for prime time.

Compare it to the old-style InnoDB mutex, "inno syncarray" in the graph above:* they have similar busy-wait loops* inno syncarray suffers from broadcast on unlock, inno futex only waits one waiter

So inno futex should be strictly better than inno syncarry on the graphs from part2 and part3 (here), ignoring some variance that occurs, but it definitely is not and I don't think variance explains it.

nthr - number of user threadsnloops - number of loop iterations per user threadthinkd - number of iterations to do the work loop (ut_busy, which is ut_delay minus the pause instruction)spinr - number of rounds for the busy wait loopspind - number of loop iterations for ut_busyy - inno, futex, posixadapt, etcnmux - number of mutexes, threads evenly distributed across mutexesretry_after_reserve - number of times to try to get the lock after reserving sync array slot, 4 is what innodb usesmax_spinners - max number of concurrent spinners for the new mutex variations I added that can limit max spinning threads -- posixgnspin, posixlnspin

I've tried replacing the log_t::mutex with a POSIX mutex and also other hot mutexes in the code. It is trivial to do that with the new infrastructure. While they perform well in synthetic benchmarks. In practice I see higher contention and a bigger drop in QPS.

My request at this point is for performance of futex-mutex to be better than the sync-array mutex all of the time. Today it appears to be better sometimes and worse sometimes. I opened http://bugs.mysql.com/bug.php?id=73763 for this.