Intra-row scanning

Details

Description

To continue scaling numbers of columns or versions in a single row, we need a mechanism to scan within a row so we can return some columns at a time. Currently, an entire row must come back as one piece.

Activity

What we did for Stargate scanners is make them iterators over cells and then allow scanners to specify the number of cells they'd like to have come back in one batch. The internal mechanics are more complicated for region servers to do this, but I think similar semantics would be good. How to handle crossing row boundaries presents a couple of options:

Include row key as well as column and timestamp with each cell value. This is not as expensive as it might sound if a simple string table encoding is used with a marker or two meaning "use last given row key" and "use last given column". Either Thrift or pbufs can handle this by marking row and column keys as optional.

Make Result capable of holding more than one row.

Return early to the client at row boundary and make it do scanner.next() to start up again on the next row.

Andrew Purtell
added a comment - 17/Jun/09 22:06
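
The first option above (send row and column keys with every cell, but let a marker stand in for "use last given row key" and "use last given column") can be sketched roughly as follows. This is only an illustration under assumptions: the class CellDeltaCodec, its Cell type, and the use of null as the "optional field omitted" marker are hypothetical and are not part of HBase, Stargate, Thrift, or protobuf. Keys are plain strings to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names come from HBase, Thrift, or protobuf.
final class CellDeltaCodec {

    /** One cell. In an encoded list, a null row or column means "use the last given one". */
    static final class Cell {
        final String row;
        final String column;
        final long timestamp;
        final String value;

        Cell(String row, String column, long timestamp, String value) {
            this.row = row;
            this.column = column;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /** Drops row and column keys that repeat the previous cell's keys. */
    static List<Cell> encode(List<Cell> cells) {
        List<Cell> out = new ArrayList<>();
        String lastRow = null;
        String lastColumn = null;
        for (Cell c : cells) {
            String row = c.row.equals(lastRow) ? null : c.row;
            String column = c.column.equals(lastColumn) ? null : c.column;
            out.add(new Cell(row, column, c.timestamp, c.value));
            lastRow = c.row;
            lastColumn = c.column;
        }
        return out;
    }

    /** Restores the omitted keys by carrying the last seen row and column forward. */
    static List<Cell> decode(List<Cell> encoded) {
        List<Cell> out = new ArrayList<>();
        String lastRow = null;
        String lastColumn = null;
        for (Cell e : encoded) {
            lastRow = (e.row != null) ? e.row : lastRow;
            lastColumn = (e.column != null) ? e.column : lastColumn;
            out.add(new Cell(lastRow, lastColumn, e.timestamp, e.value));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Cell> cells = Arrays.asList(
            new Cell("row1", "cf:a", 2L, "v1"),
            new Cell("row1", "cf:a", 1L, "v0"),
            new Cell("row2", "cf:a", 1L, "v2"));
        for (Cell c : decode(encode(cells))) {
            System.out.println(c.row + " " + c.column + " " + c.timestamp + " " + c.value);
        }
    }
}
```

Because a batch of cells from a wide row repeats the same row key many times, most encoded cells in such a batch would carry no row key at all, which is why the per-cell overhead stays small.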

Result is actually just KeyValue[]... Each KeyValue holds ALL the data, row, family, qualifier, timestamp, value, type. So we already have all the information we need to put things back in any way we want.

The byte [] row inside Result is actually computed when you ask for it, and it just grabs the row from the first KV, so it's actually already capable from a data structure perspective to hold multiple rows.

Jonathan Gray
added a comment - 17/Jun/09 22:12
Good stuff, Andrew.
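
The point about Result being just an array of self-describing KeyValues can be illustrated with a small model. This is a sketch under assumptions, not the real HBase classes: MultiRowResultSketch, Kv, Result, and groupByRow() are hypothetical names, and strings stand in for the byte arrays HBase actually uses.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a Result backed by an array of self-describing cells.
final class MultiRowResultSketch {

    /** Each cell carries its own row, family, qualifier, timestamp, and value. */
    static final class Kv {
        final String row;
        final String family;
        final String qualifier;
        final long timestamp;
        final String value;

        Kv(String row, String family, String qualifier, long timestamp, String value) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    static final class Result {
        private final Kv[] kvs;

        Result(Kv... kvs) {
            this.kvs = kvs;
        }

        /** The row is not stored separately; it is read from the first cell on demand. */
        String getRow() {
            return kvs.length == 0 ? null : kvs[0].row;
        }

        /** Since every cell names its row, a multi-row array can be regrouped at any time. */
        Map<String, List<Kv>> groupByRow() {
            Map<String, List<Kv>> rows = new LinkedHashMap<>();
            for (Kv kv : kvs) {
                List<Kv> cells = rows.get(kv.row);
                if (cells == null) {
                    cells = new ArrayList<>();
                    rows.put(kv.row, cells);
                }
                cells.add(kv);
            }
            return rows;
        }
    }

    public static void main(String[] args) {
        Result r = new Result(
            new Kv("row1", "cf", "a", 1L, "v1"),
            new Kv("row1", "cf", "b", 1L, "v2"),
            new Kv("row2", "cf", "a", 1L, "v3"));
        System.out.println(r.getRow() + " " + r.groupByRow().keySet());
    }
}
```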

Patch soon with first cut. Will add a bit of state to the scanner and change next() semantics on the client such that more than one call to next() may be needed to retrieve the full row. next() will not span rows. The new next() behavior will be configurable, off by default, and toggled by a config variable or Scan parameter.

Andrew Purtell
added a comment - 02/Oct/09 19:39
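
The next() semantics described in this comment (a per-call cell limit, several calls per wide row, never crossing a row boundary) can be modeled in a few lines. This is a toy sketch under assumptions rather than the patch itself: BatchingScannerSketch, its Cell type, and the convention that a batch of 0 or less means "whole row" are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory model of intra-row batching; not the HBase scanner.
final class BatchingScannerSketch {

    static final class Cell {
        final String row;
        final String column;
        final String value;

        Cell(String row, String column, String value) {
            this.row = row;
            this.column = column;
            this.value = value;
        }
    }

    private final List<Cell> cells; // assumed sorted by row, then column
    private final int batch;        // max cells per next(); <= 0 returns the whole row
    private int pos = 0;            // the "bit of state" that remembers where we stopped

    BatchingScannerSketch(List<Cell> cells, int batch) {
        this.cells = cells;
        this.batch = batch;
    }

    /** Returns up to batch cells of the current row, or null when the scan is exhausted. */
    List<Cell> next() {
        if (pos >= cells.size()) {
            return null;
        }
        String row = cells.get(pos).row;
        List<Cell> out = new ArrayList<>();
        while (pos < cells.size()
                && cells.get(pos).row.equals(row)
                && (batch <= 0 || out.size() < batch)) {
            out.add(cells.get(pos++));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Cell> data = Arrays.asList(
            new Cell("row1", "cf:a", "1"),
            new Cell("row1", "cf:b", "2"),
            new Cell("row1", "cf:c", "3"),
            new Cell("row2", "cf:a", "4"));
        BatchingScannerSketch scanner = new BatchingScannerSketch(data, 2);
        for (List<Cell> part = scanner.next(); part != null; part = scanner.next()) {
            System.out.println(part.get(0).row + ": " + part.size() + " cell(s)");
        }
    }
}
```

With a batch of 2, the three cells of row1 come back over two calls and row2 starts on a fresh call, so no returned list ever mixes rows.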

stack
added a comment - 13/Oct/09 04:48 Patch looks good. Only thing is that maybe the test should check that a Result does not contain results that span rows? (We're not supposed to cross rows inside a call to next(), right?)

This feature looks very useful for my ongoing project, but the attached patches can't be applied to my HBase 0.20.3. There are some unmatched hunks in Scan.java and HRegion.java, and the nextInternal method of HRegion.java in particular differs a lot. As an ordinary user of HBase, I'm not sure how to adjust the related code. Can anybody provide a patch for 0.20.3? Thanks very much!

Yi Liang
added a comment - 15/Mar/10 01:47

I noticed this was dropped in the pivot from 0.20_pre_durability to 0.20. But as a matter of fact we have an open internal ticket on this. Unit tests and bugfixes coming. Will submit to reviewboard when ready.

Andrew Purtell
added a comment - 19/Jun/10 07:16

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/376/
-----------------------------------------------------------

Review request for hbase.

Summary
-------

The query matcher will be confused when scanning within a row. Avoid calling setRow() if the row has not changed. This requires a string comparison per next().

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/376/#review463
-----------------------------------------------------------

Ship it!

+1 on commit. One suggestion on how to make some minor savings. Go ahead and commit with it if it makes sense to you (or do without the suggestion).

HBase Review Board
added a comment - 23/Jul/10 06:18 Message from: stack@duboce.net
src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
< http://review.hbase.org/r/376/#comment1902 >
Consider using method Ryan added today:
public boolean matchingRows(final KeyValue left, final byte [] right) {
Then you could do if (!KeyValue.matchingRows(matcher.row, peeked))
.. and only do a peeked.getRow() when you know you've changed rows.
Might save a bit of byte array making.
stack
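
The saving suggested in this review (compare the peeked cell's row against the matcher's current row in place, and only copy the row out when it has actually changed) might look roughly like the following. This is a self-contained sketch under assumptions, not the StoreScanner code: RowChangeCheckSketch, Kv, Matcher, and onPeek() are hypothetical stand-ins, and only the matchingRows(left, right) shape follows the signature quoted above. The null check mirrors the conditional mentioned in the follow-up comment on this review.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-ins for the scanner, matcher, and KeyValue; not HBase code.
final class RowChangeCheckSketch {

    /** A cell whose row bytes live inside a larger backing buffer. */
    static final class Kv {
        final byte[] buf;
        final int rowOffset;
        final int rowLength;

        Kv(String row, String rest) {
            byte[] r = row.getBytes(StandardCharsets.UTF_8);
            byte[] s = rest.getBytes(StandardCharsets.UTF_8);
            buf = new byte[r.length + s.length];
            System.arraycopy(r, 0, buf, 0, r.length);
            System.arraycopy(s, 0, buf, r.length, s.length);
            rowOffset = 0;
            rowLength = r.length;
        }

        /** Allocates a fresh array on every call, which is the cost to avoid. */
        byte[] getRow() {
            return Arrays.copyOfRange(buf, rowOffset, rowOffset + rowLength);
        }
    }

    /** Compares the cell's row bytes in place, with no allocation. */
    static boolean matchingRows(Kv left, byte[] right) {
        if (left.rowLength != right.length) {
            return false;
        }
        for (int i = 0; i < right.length; i++) {
            if (left.buf[left.rowOffset + i] != right[i]) {
                return false;
            }
        }
        return true;
    }

    static final class Matcher {
        byte[] row;
        int setRowCalls;

        void setRow(byte[] newRow) {
            row = newRow;
            setRowCalls++;
        }
    }

    /** Resets the matcher only on the first cell or when the row really changed. */
    static void onPeek(Matcher matcher, Kv peeked) {
        if (matcher.row == null || !matchingRows(peeked, matcher.row)) {
            matcher.setRow(peeked.getRow());
        }
    }

    public static void main(String[] args) {
        Matcher m = new Matcher();
        onPeek(m, new Kv("row1", "cf:a"));
        onPeek(m, new Kv("row1", "cf:b")); // same row: no setRow(), no copy
        onPeek(m, new Kv("row2", "cf:a"));
        System.out.println("setRow() calls: " + m.setRowCalls);
    }
}
```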

> +1 on commit. Suggestion on how to make some minor savings. Go ahead and commit with if it makes sense to you (or do w/o the suggestion).

Committed to branch and trunk, taking into account suggestions for savings in both cases. Also added a conditional to ensure matcher.row is not null before testing it.

Andrew


HBase Review Board
added a comment - 25/Jul/10 17:52 Message from: "Andrew Purtell" <apurtell@apache.org>
In reply to stack's review of 2010-07-22 23:11:32.

This issue was closed as part of a bulk closing operation on 2015-11-20. All issues that have been resolved and where all fixVersions have been released have been closed (following discussions on the mailing list).

Lars Francke
added a comment - 20/Nov/15 13:01