Re: (ITS#5926) slapd proxying AD with back-meta locks up

Thank you for looking into this. The new configuration does not rely
on loopbacks and instead uses back-glue. We are also running 2.4.14
with additional patches.

Cheers,
-Matt

On Mar 3, 2009, at 10:46 AM, Howard Chu wrote:
> mhardin@symas.com wrote:
>> Full_Name: Matthew Hardin
>> Version: 2.4.12
>> OS: Red Hat Enterprise Linux 4 i686
>> URL: ftp://ftp.openldap.org/incoming/
>> Submission from: (NULL) (74.38.114.185)
>>
>>
>> Hi All,
>>
>> We are using a pair of OpenLDAP 2.4.12 servers with back-meta to proxy
>> an Active Directory domain. The clients are all current versions of
>> PADL's nss_ldap libraries.
>>
>> Every once in a while (sometimes twice a day, sometimes once every two
>> weeks) one of the slapd servers will peg CPU use at 100% and stop
>> answering requests. The only way to stop slapd is with a kill -9.
>>
>> There doesn't seem to be anything to explain the lockup or allow us to
>> reproduce it. We are using redundant AD servers and they are not going
>> offline. A third slapd server running as a test server using the same
>> AD servers and configured identically but serving a much lighter
>> nss_ldap load does not fail at all. We have ruled out hardware, OS, and
>> connectivity as possible causes.
>>
>> We are unfortunately unable to attach gdb to the running processes, as
>> these are production servers and need to be restarted immediately. Our
>> smaller test system does not exhibit the same behavior, and there is
>> nothing unusual in the server logs. We do have core files generated
>> from kill -6 commands, and they are all eerily similar to the backtrace
>> below in that they have one or more threads waiting for a search or a
>> bind response from AD.
>>
>> I am also enclosing relevant portions of slapd.conf for these systems.
>> Please let me know if any additional information would be useful.
>>
>> Thanks,
>>
>> -Matt
>>
>> -----
>>
>>
>> (gdb) thr apply all bt
>
>> Thread 1 (process 29769):
>> #0 0x005fa410 in __kernel_vsyscall ()
>> #1 0x004ddd10 in raise () from /lib/libc.so.6
>> #2 0x004df621 in abort () from /lib/libc.so.6
>> #3 0x004d715b in __assert_fail () from /lib/libc.so.6
>> #4 0x0806eec8 in slap_listener (sl=0x9583108)
>>     at /home/build/sol-2_4_12-1-nonopt/sol24/ldap24/servers/slapd/daemon.c:1803
>> #5 0x0806f643 in slap_listener_thread (ctx=0x4e92220, ptr=0x9583108)
>>     at /home/build/sol-2_4_12-1-nonopt/sol24/ldap24/servers/slapd/daemon.c:1997
>> #6 0x00a10783 in ldap_int_thread_pool_wrapper (xpool=0x959a010)
>>     at /home/build/sol-2_4_12-1-nonopt/sol24/ldap24/libraries/libldap_r/tpool.c:663
>> #7 0x0038a45b in start_thread () from /lib/libpthread.so.0
>> #8 0x00585c4e in clone () from /lib/libc.so.6
>> (gdb)
>
> It seems you sent the wrong backtrace; this one doesn't show any
> signs of looping or anything that would indicate heavy CPU usage. It
> shows an assert which would kill the process, leading to 0% CPU
> usage. This assert was most likely fixed in 2.4.14.
>
>> slapd.conf
>
>> #######################################################################
>> # bdb database definitions
>> #######################################################################
>> database bdb
>> suffix "ou=nisdata"
>
>> #######################################################################
>> # Definitions for proxy and cache to AD
>> #######################################################################
>> database meta
>> suffix "dc=my-customer,dc=com"
>
>> # The link to AD:
>> uri ldaps://ldap-prd-dc01.my-customer.com/dc=ad,dc=my-customer,dc=com
>>     ldaps://ldap-prd-dc02.my-customer.com/
>
>> # The link to the NIS data directory (yes, we could chain/glue, that's
>> # for later)
>> uri ldapi://%2fvar%2fsymas%2frun%2fldapi/dc=nis,dc=my-customer,dc=com
>
> Pointing back-meta at its own slapd will inevitably exhaust the
> thread pool, since each incoming operation then needs two threads:
> one for the original request and one for the loopback request it
> generates.
>
> This ITS will be closed.
> --
> -- Howard Chu
> CTO, Symas Corp. http://www.symas.com
> Director, Highland Sun http://highlandsun.com/hyc/
> Chief Architect, OpenLDAP http://www.openldap.org/project/