Thursday, January 12, 2017

increasing latency using Google RE2 regex library

Recently after building a text processing service using Google RE2 regex library which is assumed to be thread friendly, we benched it with multiple threads which share the same regex instances. We are surprised that the latency of our service is increasing as we bench the service again and again. Initially we could get something like 200 microsecond latency, but later we got 200 millisecond latency which is 1000 times slower.

According to Reference [1], RE2 uses a fixed amount of memory cache for each regex instance, and if we run out of the cache memory. We searched online to find RE2 documents about the memory cache, but found nothing about it. We finally had to look into the source codes of RE2, and found that there is a class re2::RE2::Options which could take a parameter called max_mem. The Options could be used to initialize an RE2 regex instance.
Moreover, inside RE2, there are two methods to do regex matching: one is DFA which is much faster than the other one NFA. However, DFA needs more memory and could run out of memory. After DFA runs out of memory for many times, RE2 would use NFA instead, which becomes much slower, as shown in the RE2 code comments (budget here means its configured memory):

Once a DFA fills its budget, it flushes its cache and starts over.If this happens too often, RE2 falls back on the NFA implementation.

For more details about the max_mem option, please refer to Reference [2].

RE2's NFA interface could be found here, while its DFA interface is here.

References:

// The max_mem option controls how much memory can be used
// to hold the compiled form of the regexp (the Prog) and
// its cached DFA graphs. Code Search placed limits on the number
// of Prog instructions and DFA states: 10,000 for both.
// In RE2, those limits would translate to about 240 KB per Prog
// and perhaps 2.5 MB per DFA (DFA state sizes vary by regexp; RE2 does a
// better job of keeping them small than Code Search did).
// Each RE2 has two Progs (one forward, one reverse), and each Prog
// can have two DFAs (one first match, one longest match).
// That makes 4 DFAs:
//
// forward, first-match - used for UNANCHORED or ANCHOR_LEFT searches
// if opt.longest_match() == false
// forward, longest-match - used for all ANCHOR_BOTH searches,
// and the other two kinds if
// opt.longest_match() == true
// reverse, first-match - never used
// reverse, longest-match - used as second phase for unanchored searches
//
// The RE2 memory budget is statically divided between the two
// Progs and then the DFAs: two thirds to the forward Prog
// and one third to the reverse Prog. The forward Prog gives half
// of what it has left over to each of its DFAs. The reverse Prog
// gives it all to its longest-match DFA.
//
// Once a DFA fills its budget, it flushes its cache and starts over.
// If this happens too often, RE2 falls back on the NFA implementation.