Tuesday, December 9, 2008

How to tell when a performance project succeeds?

The Volo project is an effort to improve the interface between sockets and any socket-driven subsystems, including the TCP/IP stack. During the Volo team's testing, the system panicked in some IPsec tests. See this bug for what they reported.

In our IPsec, we have LARVAL IPsec security associations (SAs). These are SAs that reserve a unique 32-bit Security Parameters Index (SPI), but have no other data. If a packet arrives for a LARVAL SA, we queue it up, so that once key management fills in the SA, the packet can go through. We do this because of the IKE Quick Mode Exchange, which looks like this:

INITIATOR                                RESPONDER
---------                                ---------

IKE Quick Mode Packet #1 -------->

                         <-------- IKE Quick Mode Packet #2

IKE Quick Mode Packet #3 -------->

Once the initiator receives Quick Mode packet #2, it has enough information to transmit an IPsec-protected packet. Unfortunately, the responder cannot complete its Security Association entries until it receives packet #3. It is possible, then, that the initiator's IPsec packet may arrive before the responder has finished processing IKE. Let's look at the packets again:

INITIATOR                                RESPONDER
---------                                ---------

IKE Quick Mode Packet #1 -------->

                         <-------- IKE Quick Mode Packet #2

ESP or AH packet         -------->       Does this packet...

IKE Quick Mode Packet #3 -------->       ... get processed after my receipt of #3, and after I SADB_UPDATE my inbound SA, which changes it from LARVAL to MATURE?

Now the code that queues up an inbound IPsec packet for a LARVAL SA is sadb_set_lpkt(), as was shown in the bug's description. It turns out there was a locking bug in this function - and we even had an ASSERT()-ion that the SA manipulated by sadb_set_lpkt() was always larval. The problem was, we discounted the possibility of IKE finishing between the detection of a LARVAL SA and the actual call to sadb_set_lpkt().

The Volo project improved UDP latency enough so that the IKE packet wormed its way up the stack and into in.iked faster than the concurrent ESP or AH packet. The aforementioned ASSERT() tripped during Volo testing, because we did not check the SA's state while holding its lock. Had we done so, we could have told that the LARVAL SA had been promoted to MATURE, and we could have gone ahead and processed the packet.
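To make the fix concrete, here is a minimal user-space sketch of the "re-check the state under the lock" pattern, using POSIX threads. It is not the actual Solaris code: the sa_t type, the state names, and the functions sa_init(), sa_queue_if_larval(), and sa_mature() are all made up for illustration, and parking at most one packet per SA is an assumption of the sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified SA states; not the real kernel definitions. */
typedef enum { SA_LARVAL, SA_MATURE } sa_state_t;

typedef struct {
    pthread_mutex_t lock;       /* protects state and queued_pkt */
    sa_state_t      state;
    const void     *queued_pkt; /* packet parked while the SA is LARVAL */
} sa_t;

typedef enum { PKT_QUEUED, PKT_PROCESS_NOW } pkt_verdict_t;

void sa_init(sa_t *sa)
{
    assert(sa != NULL);
    pthread_mutex_init(&sa->lock, NULL);
    sa->state = SA_LARVAL;
    sa->queued_pkt = NULL;
}

/*
 * Inbound packet path. The caller may have seen a LARVAL SA earlier
 * without holding the lock, so that observation can be stale. Instead
 * of asserting that the SA is still LARVAL, re-check under the lock
 * and tell the caller what to do with the packet.
 */
pkt_verdict_t sa_queue_if_larval(sa_t *sa, const void *pkt)
{
    pkt_verdict_t verdict;

    assert(sa != NULL && pkt != NULL);
    pthread_mutex_lock(&sa->lock);
    if (sa->state == SA_LARVAL) {
        sa->queued_pkt = pkt;      /* this sketch keeps only the latest packet */
        verdict = PKT_QUEUED;
    } else {
        verdict = PKT_PROCESS_NOW; /* key management won the race */
    }
    pthread_mutex_unlock(&sa->lock);
    return verdict;
}

/*
 * Key management path. Promote the SA and hand back the parked packet
 * (or NULL if there is none) so the caller can process it.
 */
const void *sa_mature(sa_t *sa)
{
    const void *pkt;

    assert(sa != NULL);
    pthread_mutex_lock(&sa->lock);
    sa->state = SA_MATURE;
    pkt = sa->queued_pkt;
    sa->queued_pkt = NULL;
    pthread_mutex_unlock(&sa->lock);
    return pkt;
}
```

Because the state test and the queueing happen under the same lock, either the packet is parked before the promotion and sa_mature() returns it, or the promotion is seen first and the caller gets PKT_PROCESS_NOW. Neither ordering strands the packet or trips an assertion.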

This race condition had been present ever since sadb_set_lpkt() was introduced in Solaris 9, but it took Volo's improved performance to find it. So hats off to Volo for speeding things up enough to find long-dormant race conditions!

Addendum - IKEv2 does not have this problem because its equivalent to v1's Quick Mode is a simpler request/response exchange, so the responder is ready to receive when it sends the response back to the initiator.