We're in a position right now, where we have a change list (like a transactionlog) and we'd like to index the changes by author, but a typical query is:

Show the last n changes for author "Foo Bar"

or

Show changes after Jan. 1st, 2012 for author "Foo Bar"

Certainly, we can denormalize our data to facilitate this lookup. But the ideaof using intersecting iterators seems intriguing (to get a modicum ofdata-local server-side joining), but our ideas for shoe-horning the query intointersecting iterators seems really wonky or half-baked. Largely, we'rerunning into the restriction that intersecting iterators are based upon theproduct of a boolean conjunctive statements about term equality. What we'dreally like is a little more range-based. The Accumulo documentation alludesto the problem a little:

If the results are unordered this is quite effective as the first results to arrive are as good as any others to the user.

In our case, order matters because we want the last results without pulling ineverything.

We looked at the code for intersecting iterators a little, and noticed thatthere's an inheritance design, but we're not convinced that it's really"designed for extension" and if it is, we're not sure if it can be extended tomeet our needs gracefully. If it can, we're really interested in anysuggestions or prior work.

Otherwise, we're open to the idea that there's Accumulo features we're just notaware of beyond intersecting iterators that are a better fit.

It would be wonderful to have a technique to hedge against over-denormalizingour data for every variant of query we have to support.

You're pretty much on the spot regarding two aspects about the currentIntersectingIterator:

1- It's not really extensible (there are hooks for building doc IDs,but you still need the same `partition term: docId` key structure)2- Its main strength is that it can do the merges of sorted lists ofdoc IDs based on equality expressions (ie, `author=="bob" andday=="20120627"`)

Fortunately, the logic isn't very complicated for re-creating themerging stuff. Personally, I think it's easy enough to separate thelogic of joining N streams of iterator results from the actualscanning. Unfortunately, this would be left up to you to do at themoment :)

You could do range searches by consuming sets of values and sortingall of the docIds in that range by throwing them into a TreeSet. Thatwould let you emit doc IDs in a globally sorted order for the givenrange of terms. This can get problematic if the range ends up beingvery large because your iterator stack may periodically be destroyedand rebuilt.

On Thu, Jun 28, 2012 at 1:49 PM, Sukant Hajra <[EMAIL PROTECTED]> wrote:> We're in a position right now, where we have a change list (like a transaction> log) and we'd like to index the changes by author, but a typical query is:>> Show the last n changes for author "Foo Bar">> or>> Show changes after Jan. 1st, 2012 for author "Foo Bar">> Certainly, we can denormalize our data to facilitate this lookup. But the idea> of using intersecting iterators seems intriguing (to get a modicum of> data-local server-side joining), but our ideas for shoe-horning the query into> intersecting iterators seems really wonky or half-baked. Largely, we're> running into the restriction that intersecting iterators are based upon the> product of a boolean conjunctive statements about term equality. What we'd> really like is a little more range-based. The Accumulo documentation alludes> to the problem a little:>> If the results are unordered this is quite effective as the first results> to arrive are as good as any others to the user.>> In our case, order matters because we want the last results without pulling in> everything.>> We looked at the code for intersecting iterators a little, and noticed that> there's an inheritance design, but we're not convinced that it's really> "designed for extension" and if it is, we're not sure if it can be extended to> meet our needs gracefully. If it can, we're really interested in any> suggestions or prior work.>> Otherwise, we're open to the idea that there's Accumulo features we're just not> aware of beyond intersecting iterators that are a better fit.>> It would be wonderful to have a technique to hedge against over-denormalizing> our data for every variant of query we have to support.>> Thanks for your help,> Sukant

Excerpts from William Slacum's message of Thu Jun 28 16:04:32 -0500 2012:>> You're pretty much on the spot regarding two aspects about the current> IntersectingIterator:> > 1- It's not really extensible (there are hooks for building doc IDs,> but you still need the same `partition term: docId` key structure)> 2- Its main strength is that it can do the merges of sorted lists of> doc IDs based on equality expressions (ie, `author=="bob" and> day=="20120627"`)> > Fortunately, the logic isn't very complicated for re-creating the> merging stuff. Personally, I think it's easy enough to separate the> logic of joining N streams of iterator results from the actual> scanning. Unfortunately, this would be left up to you to do at the> moment :)> > You could do range searches by consuming sets of values and sorting> all of the docIds in that range by throwing them into a TreeSet. That> would let you emit doc IDs in a globally sorted order for the given> range of terms.

I understand everything above, I think. Thanks for the prompt reply.

> This can get problematic if the range ends up being very large because your> iterator stack may periodically be destroyed and rebuilt.

This particular statement confused me. When you said TreeSet, you're talkingabout a straight-forward in-memory collection from java.util or similar, right?

Because I'm confused about which "iterator stack may periodically be destroyedand rebuilt." It sounds like we're talking about some garbage collectionspecific to Accumulo. Am I missing something here?

By iterator stack I am referring to the Accumulo iterators. Resourcesharing among scan sessions is implemented by destroying a user scansession and eventually recreating the iterator stack. The new stack is thenseek'd to the last key returned by the entire stack. If you were holdingsome state, such as a set of keys, it would be rebuilt every time the stackis created.On Jul 1, 2012 5:55 PM, "Sukant Hajra" <[EMAIL PROTECTED]> wrote:

> Excerpts from William Slacum's message of Thu Jun 28 16:04:32 -0500 2012:> >> > You're pretty much on the spot regarding two aspects about the current> > IntersectingIterator:> >> > 1- It's not really extensible (there are hooks for building doc IDs,> > but you still need the same `partition term: docId` key structure)> > 2- Its main strength is that it can do the merges of sorted lists of> > doc IDs based on equality expressions (ie, `author=="bob" and> > day=="20120627"`)> >> > Fortunately, the logic isn't very complicated for re-creating the> > merging stuff. Personally, I think it's easy enough to separate the> > logic of joining N streams of iterator results from the actual> > scanning. Unfortunately, this would be left up to you to do at the> > moment :)> >> > You could do range searches by consuming sets of values and sorting> > all of the docIds in that range by throwing them into a TreeSet. That> > would let you emit doc IDs in a globally sorted order for the given> > range of terms.>> I understand everything above, I think. Thanks for the prompt reply.>> > This can get problematic if the range ends up being very large because> your> > iterator stack may periodically be destroyed and rebuilt.>> This particular statement confused me. When you said TreeSet, you're> talking> about a straight-forward in-memory collection from java.util or similar,> right?>> Because I'm confused about which "iterator stack may periodically be> destroyed> and rebuilt." It sounds like we're talking about some garbage collection> specific to Accumulo. Am I missing something here?>> -Sukant>

Since I had started a response, but Bill beat me to it, let me reiterate.

The tear-down is more for assuring responsiveness when multiple scans are happening at one time. There's a buffer between TabletServer(s) and the client which (if memory serves) it's filled, the scan session is a candidate to be torn down, and later recreated.

To avoid duplicate work by your Accumulo iterators, the last key the iterators returned is maintained by Accumulo.

For example, if you started a scan with a Range:

(-inf, +inf)

Say you scanned 2000/10000 keys in a table of monotonically increasing Keys where only the row is populated. The buffer was filled, the iterators torn down, and re-created some amount of time later. Instead of getting the (-inf, +inf) range again, you would then get the range:

(2000, +inf)

Meaning, the initial infinite start key would be replaced with a start key which was the last key your previous scan returned, non-inclusive.

In short, it's good practice to try to keep Accumulo iterators from holding on to state in memory, otherwise you may get stuck creating the same in-memory members on your iterators repeatedly. See ACCUMULO-625 for some thoughts about trying to avoid this lost-state issue.

- Josh

On 07/01/2012 05:18 PM, William Slacum wrote:>> By iterator stack I am referring to the Accumulo iterators. Resource > sharing among scan sessions is implemented by destroying a user scan > session and eventually recreating the iterator stack. The new stack is > then seek'd to the last key returned by the entire stack. If you were > holding some state, such as a set of keys, it would be rebuilt every > time the stack is created.>> On Jul 1, 2012 5:55 PM, "Sukant Hajra" <[EMAIL PROTECTED] > <mailto:[EMAIL PROTECTED]>> wrote:>> Excerpts from William Slacum's message of Thu Jun 28 16:04:32> -0500 2012:> >> > You're pretty much on the spot regarding two aspects about the> current> > IntersectingIterator:> >> > 1- It's not really extensible (there are hooks for building doc IDs,> > but you still need the same `partition term: docId` key structure)> > 2- Its main strength is that it can do the merges of sorted lists of> > doc IDs based on equality expressions (ie, `author=="bob" and> > day=="20120627"`)> >> > Fortunately, the logic isn't very complicated for re-creating the> > merging stuff. Personally, I think it's easy enough to separate the> > logic of joining N streams of iterator results from the actual> > scanning. Unfortunately, this would be left up to you to do at the> > moment :)> >> > You could do range searches by consuming sets of values and sorting> > all of the docIds in that range by throwing them into a TreeSet.> That> > would let you emit doc IDs in a globally sorted order for the given> > range of terms.>> I understand everything above, I think. Thanks for the prompt reply.>> > This can get problematic if the range ends up being very large> because your> > iterator stack may periodically be destroyed and rebuilt.>> This particular statement confused me. When you said TreeSet,> you're talking> about a straight-forward in-memory collection from java.util or> similar, right?>> Because I'm confused about which "iterator stack may periodically> be destroyed> and rebuilt." It sounds like we're talking about some garbage> collection> specific to Accumulo. Am I missing something here?>> -Sukant>

Excerpts from Sukant Hajra's message of Thu Jun 28 15:49:11 -0500 2012:>> The Accumulo documentation alludes to the problem a little:> > If the results are unordered this is quite effective as the first results> to arrive are as good as any others to the user.> > In our case, order matters because we want the last results without pulling in> everything.

Actually, I was just thinking about this a little. I don't know if this isspecified in the documentation, but is there /any/ reliable (deterministic)ordering for the values returned by intersecting iterators?

If there is, would it be horribly ill-advised to rely on this ordering forapplication logic if we got clever with our schema?

Also, if someone could reply with the exact algorithm for this ordering, itwould help put less burden on us to reverse engineer and/or read the sourcecode correctly.

If I wanted to query for all documents that contained "Bill", "Josh" and"Sukant", I'd set up an IntersectingIterator that three term sources. Theterm sources would be created to look at one of {"Bill", "Josh", "Sukant"}for their column family values. The column qualifiers contained documentIDs for documents that contain the given term. This yields a set up where,for a given term, we have a sorted list of document IDs.

To give a bit of a visualization, you can think of this structure in treeform: Intersection / | \ "Sukant" "Bill" "Josh" [0, [1, [3, 3, 2, 4, 6] 3] 5]On our first pass, the IntersectingIterator will note that its childrenpoint to the document IDs 0, 1 and 3. Since each list of doc IDs is sorted,we can deduce that the earliest doc ID that could be a potential match is3. So, it will seek the term sources for "Sukant" and "Bill" to at leastthe key {row: "partitionN", colf: <term>, colq: "3"}. On the next pass,we'll note that each term source is pointing to doc ID 3. This means we'vefound an intersection, so the top level IntersectingIterator will returndocID 3.

When the session requests the next matching docID, the iterator willadvance each iterator by calling next(). The IntersectingIterator now seesits children are all positioned at docIDs [6, null, 5] (the `null` valuearises because the "Bill" term source doesn't have a key beyond {row:"partitionN", colf: "Bill", colq: "3"}. This state means that theintersection is done, because one of the term sources has exhausted itspossible values, so there's no doc ID that will occur in all three lists.

> Excerpts from Sukant Hajra's message of Thu Jun 28 15:49:11 -0500 2012:> >> > The Accumulo documentation alludes to the problem a little:> >> > If the results are unordered this is quite effective as the first> results> > to arrive are as good as any others to the user.> >> > In our case, order matters because we want the last results without> pulling in> > everything.>> Actually, I was just thinking about this a little. I don't know if this is> specified in the documentation, but is there /any/ reliable (deterministic)> ordering for the values returned by intersecting iterators?>> If there is, would it be horribly ill-advised to rely on this ordering for> application logic if we got clever with our schema?>> Also, if someone could reply with the exact algorithm for this ordering, it> would help put less burden on us to reverse engineer and/or read the source> code correctly.>> Thanks for your help,> Sukant>

Excerpts from William Slacum's message of Sun Jul 01 23:23:50 -0500 2012:>> To give a bit of a visualization, you can think of this structure in tree> form:> > > Intersection> / | \> "Sukant" "Bill" "Josh"> [0, [1, [3,> 3, 2, 4,> 6] 3] 5]> > > On our first pass, the IntersectingIterator will note that its children> point to the document IDs 0, 1 and 3. Since each list of doc IDs is sorted,> we can deduce that the earliest doc ID that could be a potential match is> 3. So, it will seek the term sources for "Sukant" and "Bill" to at least> the key {row: "partitionN", colf: <term>, colq: "3"}. On the next pass,> we'll note that each term source is pointing to doc ID 3. This means we've> found an intersection, so the top level IntersectingIterator will return> docID 3.> > When the session requests the next matching docID, the iterator will> advance each iterator by calling next(). The IntersectingIterator now sees> its children are all positioned at docIDs [6, null, 5] (the `null` value> arises because the "Bill" term source doesn't have a key beyond {row:> "partitionN", colf: "Bill", colq: "3"}. This state means that the> intersection is done, because one of the term sources has exhausted its> possible values, so there's no doc ID that will occur in all three lists.

Hi Bill,

Thanks for the great illustration above of the algorithm. I get the feelingthat this algorithm is pretty central to the implementation, and unlikely tochange. I don't feel guilty relying on it for ordering.

Does that assumption sound dangerous?

Assuming it's not. My next question comes from reading more closely thedocumentation for IndexedDocIterator [1], which indicates that we're notindexing raw doc IDs, but at tuple of (docType, docId).

Is this correct?

The documentation is a little ambiguous of what the docType is for, but I'mwondering if I can use it to assert an ordering for the results of anintersecting iterator. For instance, we may use something timestamp-y for thedocType if that's the case.

I'll have prototypes done early next week to confirm my suspicions above, butin the meantime, it's great to get expert feedback.

On Sun, Jul 1, 2012 at 11:57 PM, Sukant Hajra <[EMAIL PROTECTED]> wrote:> Excerpts from Sukant Hajra's message of Thu Jun 28 15:49:11 -0500 2012:>>>> The Accumulo documentation alludes to the problem a little:>>>> If the results are unordered this is quite effective as the first results>> to arrive are as good as any others to the user.>>>> In our case, order matters because we want the last results without pulling in>> everything.>> Actually, I was just thinking about this a little. I don't know if this is> specified in the documentation, but is there /any/ reliable (deterministic)> ordering for the values returned by intersecting iterators?

Unrelated to the intersecting iterator, when using the batch scanneryou can not expect results in order. The batch scanner send querysout to tablet servers in parallel. As batches of key/values arereturned from the tablet server they are immediately made available tothe client. Therefore the client will iterate over interleaved keyvalues from different tablets. The batch scanner is usually used withthe intersecting iterator to parallelize scans. I think this isdocumented in the batch scanner java docs.

If the regular scanner were used, then the client would see the keyvalues in the order returned by the iterator. However, only onetablet would be scanned at a time.

>> If there is, would it be horribly ill-advised to rely on this ordering for> application logic if we got clever with our schema?>> Also, if someone could reply with the exact algorithm for this ordering, it> would help put less burden on us to reverse engineer and/or read the source> code correctly.>> Thanks for your help,> Sukant

NEW: Monitor These Apps!

All projects made searchable here are trademarks of the Apache Software Foundation.
Service operated by Sematext