Details

Description

PooledSegmentReader pools the underlying byte arrays of deleted docs and norms for realtime search. It is designed for use with IndexReader.clone, which can create many copies of byte arrays that are all of the same length for a given segment. Pooling lets these arrays be reused, which can save memory.
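The pooling idea described above can be sketched as a free list keyed by array length, since all byte arrays for a given segment share one length. This is an illustrative sketch only, not the actual PooledSegmentReader code; the class and method names are invented:

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: byte[]s for a segment are all the same length, so a
// free list keyed by length lets cloned readers reuse buffers instead of
// allocating fresh ones on every clone.
class ByteArrayPool {
    private final Map<Integer, ArrayDeque<byte[]>> free = new HashMap<>();

    synchronized byte[] acquire(int length) {
        ArrayDeque<byte[]> q = free.get(length);
        byte[] b = (q == null) ? null : q.poll();
        return (b != null) ? b : new byte[length];
    }

    synchronized void release(byte[] b) {
        // return the buffer to the free list for its length
        free.computeIfAbsent(b.length, k -> new ArrayDeque<>()).push(b);
    }
}
```

A reader clone would call acquire instead of `new byte[...]`, and release when the clone is closed.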

Do we want to benchmark the memory usage of PooledSegmentReader against simply letting the GC reclaim these arrays? Often the GC alone is adequate for these smaller objects.

Jason Rutherglen
added a comment - 02/Apr/09 21:52 True, the pool would hold onto spares, but they would expire. It's mostly useful for the large on-disk segments, as those byte arrays (for BitVectors) are large, and because there are more docs in them, they'd get hit with deletes more often and so be reused fairly often.
I'm not knowledgeable enough to say whether the transactional data structure will be fast enough. We had been using http://fastutil.dsi.unimi.it/docs/it/unimi/dsi/fastutil/ints/IntRBTreeSet.html in Zoie for deleted docs and it's way slow. Binary search of an int array is faster, albeit not fast enough. The multi-dimensional array thing isn't fast enough (for searching), as we implemented this in Bobo. It's implemented in Bobo because we have a multi-value field cache (which is quite large because for each doc we're storing potentially 64 or more values in an in-place bitset) and a single massive array kills the GC. In some cases this is faster than a single large array because of the way Java (or the OS?) transfers memory around through the CPU cache.
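The "binary search of an int array" alternative Jason mentions can be illustrated with a small sketch (hypothetical names; just `Arrays.binarySearch` over a sorted deleted-doc array, not Zoie's or Bobo's actual code):

```java
import java.util.Arrays;

// Illustrative sketch: deleted doc IDs kept in a sorted int[], with
// membership tested via binary search (O(log n) per isDeleted call).
class SortedDeletes {
    private final int[] deleted;

    SortedDeletes(int[] ids) {
        this.deleted = ids.clone();
        Arrays.sort(this.deleted); // binarySearch requires sorted input
    }

    boolean isDeleted(int docId) {
        return Arrays.binarySearch(deleted, docId) >= 0;
    }
}
```

Per-hit, this is far cheaper than a tree set, but as noted above, even O(log n) per doc can be too slow on the hot isDeleted path compared to a flat bit vector's O(1) lookup.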

Jason Rutherglen
added a comment - 11/Nov/09 19:02 A likely optimization for this patch is that we'll only pool if the doc count is above a threshold; 100,000 seems like a good number. Also, pooling will be optional.
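The gating described in this comment could look something like the following (a minimal sketch with invented names; only the 100,000 threshold and the on/off switch come from the comment):

```java
// Hypothetical policy gate: small segments fall through to plain
// allocation, large ones are eligible for the pool, and pooling as a
// whole can be switched off.
class PoolingPolicy {
    static final int POOL_MIN_DOC_COUNT = 100_000; // threshold from the comment

    static boolean shouldPool(int docCount, boolean poolingEnabled) {
        return poolingEnabled && docCount >= POOL_MIN_DOC_COUNT;
    }
}
```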

Jason Rutherglen
added a comment - 21/Jan/11 23:24 I'm going to revive this, and if it works fine for trunk, then we can use the basic system for RT, e.g., LUCENE-2312. I think the only open question is how we'll shrink the pool; most likely there'd be an expiration on the pooled objects. With RT, the parallel arrays will grow, so the pool will need to be size-based, e.g., when the arrays are grown, all of the previous arrays may be forcefully evicted, or they may simply expire.
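One way to realize the "expiration on the pooled objects" idea is time-based eviction on acquire. This is an illustrative sketch, not the patch's code: the pool holds buffers of a single length (per segment), timestamps are passed in explicitly for clarity, and all names and the keep-alive value are invented:

```java
import java.util.ArrayDeque;

// Sketch of time-based expiry for pooled buffers: each freed buffer is
// stamped, and entries idle past the keep-alive window are dropped the
// next time a buffer is requested, so spares don't accumulate forever.
class ExpiringPool {
    private static final class Entry {
        final byte[] buf;
        final long freedAt;
        Entry(byte[] buf, long freedAt) { this.buf = buf; this.freedAt = freedAt; }
    }

    private final ArrayDeque<Entry> free = new ArrayDeque<>();
    private final int length;
    private final long keepAliveMillis;

    ExpiringPool(int length, long keepAliveMillis) {
        this.length = length;
        this.keepAliveMillis = keepAliveMillis;
    }

    synchronized void release(byte[] b, long nowMillis) {
        free.addLast(new Entry(b, nowMillis)); // oldest entries sit at the front
    }

    synchronized byte[] acquire(long nowMillis) {
        // evict entries that have sat unused past the keep-alive window
        while (!free.isEmpty() && nowMillis - free.peekFirst().freedAt > keepAliveMillis) {
            free.pollFirst();
        }
        Entry e = free.pollLast(); // hand out the most recently freed buffer
        return e != null ? e.buf : new byte[length];
    }
}
```

Size-based eviction (the other option mentioned) would simply drop or rebuild the free list whenever the segment's arrays grow past the pooled length.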

Michael McCandless
added a comment - 22/Jan/11 11:16 I'm working on an initial patch for this...
I think the only open question is how we'll shrink the pool; most likely there'd be an expiration on the pooled objects.
I think we can simply have a "max pooled free bit vectors"... or we may want to expire by time/staleness as well.
With RT, the parallel arrays will grow, so the pool will need to be size based, eg, when the arrays are grown, all of the previous arrays may be forcefully evicted, or they may simply expire.
True... but, like the other per-doc arrays, the BV can be overallocated (ArrayUtil.oversize) to accommodate further added docs.
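The overallocation Mike refers to can be sketched as follows. This is a rough stand-in for the ArrayUtil.oversize idea, not Lucene's actual growth policy; the headroom factor here is purely illustrative:

```java
// Illustrative sketch of overallocated growth: request ~1/8 extra headroom
// past the needed size, so repeated per-doc appends don't reallocate (and
// re-pool) the bit vector's backing array on every added doc.
class Oversize {
    static int oversize(int minSize) {
        return minSize + (minSize >>> 3) + 3; // ~12.5% headroom plus a little
    }

    static byte[] grow(byte[] current, int minLength) {
        if (current.length >= minLength) return current; // already big enough
        byte[] bigger = new byte[oversize(minLength)];
        System.arraycopy(current, 0, bigger, 0, current.length);
        return bigger;
    }
}
```

Because growth is amortized this way, most added docs touch an array that already has room, and only occasional grows pay the copy cost.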

Jason Rutherglen
added a comment - 22/Jan/11 16:18 ThreadPoolExecutor can act as a guide; its main parameters are corePoolSize, maximumPoolSize, and keepAliveTime.
In regards to using System.arraycopy for the RT parallel arrays: if they grow too large, then the array copies could become a predominant cost. However, if the default number of thread states is 8, which would mean that many DWPTs, the arrays would probably never grow large enough for their copies to become noticeably expensive, hopefully.

Jason Rutherglen
added a comment - 23/Jan/11 20:51 We want to record the deletes between getReader calls; however, there's no way to know ahead of time whether a term or query is going to hit many documents or not, meaning we can't always save deleted docids, because we'd have too many ints in RAM. I'm curious how we plan on handling this case?

Michael McCandless
added a comment - 24/Jan/11 18:28 I'm curious how we plan on handling this case?
I think we should keep the replay log smallish, or expire it aggressively with age. I suspect this optimization is only going to be "worth it" for very frequent reopens... but I'm not sure yet.
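A "smallish" replay log with a hard cap could be structured along these lines. This is a hypothetical sketch of the idea, not the patch: names and the cap are invented, and a real version would also need the age-based expiry Mike mentions:

```java
import java.util.ArrayDeque;

// Sketch of a bounded delete replay log: buffered deleted doc IDs can be
// replayed cheaply onto a cloned reader's bit vector, but once the buffer
// exceeds its cap we give up and fall back to copying the full vector.
class DeleteReplayLog {
    private final int maxBuffered;
    private final ArrayDeque<Integer> pending = new ArrayDeque<>();
    private boolean overflowed = false;

    DeleteReplayLog(int maxBuffered) { this.maxBuffered = maxBuffered; }

    void record(int docId) {
        if (overflowed) return; // already too big to be worth replaying
        if (pending.size() >= maxBuffered) {
            pending.clear();
            overflowed = true;  // caller must recopy the whole bit vector
        } else {
            pending.add(docId);
        }
    }

    boolean canReplay() { return !overflowed; }

    int pendingCount() { return pending.size(); }
}
```

This bounds RAM for the pathological case (a delete-by-query that hits huge numbers of docs) while keeping the fast path for the common trickle of deletes between reopens.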

Michael McCandless
added a comment - 24/Jan/11 20:37 Attached rough patch.
At least one test fails....
And, I haven't yet seen that this is in fact worthwhile. The rough benchmark I have (which hits other issues so the results aren't conclusive yet) doesn't show much difference w/ this patch. I think this patch may only be worthwhile at insane reopen rates, which I think in practice is rarely a legitimate use case (even though many apps start off thinking it is).

Jason Rutherglen
added a comment - 25/Jan/11 19:33 I think the likely cost of allocating a big BitVector, doing the System.arraycopy, and destroying it may be fairly low compared to Lucene resolving the deleted term, indexing the doc, flushing the tiny segment, etc.
Right, and if we pool the byte[]s we'd take the cost of instantiating and GC'ing out of the picture in the high-NRT-throughput case. It's counterintuitive and will require testing.