Details

Description

Parent allows for a cluster to retain rows for a TTL or keep a minimum number of versions.
However, if a client deletes a row all version older than the delete tomb stone will be remove at the next major compaction (and even at memstore flush - see HBASE-4241).
There should be a way to retain those version to guard against software error.

I see two options here:
1. Add a new flag HColumnDescriptor. Something like "RETAIN_DELETED".
2. Folds this into the parent change. I.e. keep minimum-number-of-versions of versions even past the delete marker.

#1 would allow for more flexibility. #2 comes somewhat naturally with parent (from a user viewpoint)

We were also discussion today that there are situations (especially in multi-master situations) where you want to retain the delete markers for some period of time as well.

I would think this would require both a family-level setting (a new one or the existing one) and also a read-time option, correct? As of now, deletes are never returned to the client. You'd have to return them in this case otherwise the user would have no idea what is actually there? I'm not sure it's fair to ask a user to understand how our delete tombstones work

Jonathan Gray
added a comment - 03/Oct/11 23:36 We were also discussion today that there are situations (especially in multi-master situations) where you want to retain the delete markers for some period of time as well.
I would think this would require both a family-level setting (a new one or the existing one) and also a read-time option, correct? As of now, deletes are never returned to the client. You'd have to return them in this case otherwise the user would have no idea what is actually there? I'm not sure it's fair to ask a user to understand how our delete tombstones work

@Andrew: Agree. There's no point really in setting up #minimumVersion if they get blown away by a delete marker... Could even argue it's a bug in the parent.

@Jon: Replication already replicates deletes. And (luckily ) it has no optimization to skip versions once it found a delete marker. So I think that should already work.
Or are you referring to some other method of multi-master replication. In that case we'd have invent a new way to let a client retrieve delete markers. I think that'd be a different jira.

Lars Hofhansl
added a comment - 04/Oct/11 00:05 @Andrew: Agree. There's no point really in setting up #minimumVersion if they get blown away by a delete marker... Could even argue it's a bug in the parent.
@Jon: Replication already replicates deletes. And (luckily ) it has no optimization to skip versions once it found a delete marker. So I think that should already work.
Or are you referring to some other method of multi-master replication. In that case we'd have invent a new way to let a client retrieve delete markers. I think that'd be a different jira.

Major compaction. If min versions is not set, follow current behavior (filter deleted rows, remove delete markers, like #1). If min versions is set, keep deleted rows, treat delete markers as normal rows (again, do not count them as versions).

In all three cases rows past max versions or expired rows past min versions will be filtered out.

Lars Hofhansl
added a comment - 04/Oct/11 06:52 Started to look at the code.
Scanning now has three different scenarios:
Normal scan. Under all circumstanced filter deleted rows (unless it's a past scan, of course). Do not return delete markers.
Minor compaction. If min versions is not set, follow current behavior (filter deleted rows, keep delete markers). If min versions is set, keep deleted rows, treat delete markers as normal rows (but do not count them as versions). I.e. delete markers are subject to maxversions, expiration, etc.
Major compaction. If min versions is not set, follow current behavior (filter deleted rows, remove delete markers, like #1). If min versions is set, keep deleted rows, treat delete markers as normal rows (again, do not count them as versions).
In all three cases rows past max versions or expired rows past min versions will be filtered out.
Jon could reuse scan type #2 or #3 for his needs, I think.

There is a pathetic scenario where CF has minversions set and a client only write delete markers. As they are not counted as versions, they would never be removed (not even in a major compaction).
I suppose to guard against that case we could remove all trailing delete markers (that are not followed by any surviving kvs).

Lars Hofhansl
added a comment - 04/Oct/11 07:11 There's a fourth scan type:
4. Memstore flush (see HBASE-4241 ). This behaves exactly like #2.
There is a pathetic scenario where CF has minversions set and a client only write delete markers. As they are not counted as versions, they would never be removed (not even in a major compaction).
I suppose to guard against that case we could remove all trailing delete markers (that are not followed by any surviving kvs).

Turns out that currently it is not at all possible to query (get or scan) any rows "behind" a delete marker, even setting a timerange on a Get that is before the delete marker will not return any rows.

Lars Hofhansl
added a comment - 04/Oct/11 21:01 Turns out that currently it is not at all possible to query (get or scan) any rows "behind" a delete marker, even setting a timerange on a Get that is before the delete marker will not return any rows.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:https://reviews.apache.org/r/2178/
-----------------------------------------------------------

Review request for hbase.

Summary
-------

In order to use a replicated cluster as backup and to make true use of HBase's timestamps it has to be possible to get/scan rows that are hidden by a delete marker. If a marker was placed at time T, it should be possible to retrieve rows with get/scan timerange of [0-T).

This changes the following:
1. MinVersions now also means: Keep this number of version even when the row was deleted.
2. Allow gets/scans to retrieve rows hidden by a delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.

The change is pretty small, but the logic is intricate, so please review carefully.

jiraposter@reviews.apache.org
added a comment - 04/Oct/11 21:28
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
Review request for hbase.
Summary
-------
In order to use a replicated cluster as backup and to make true use of HBase's timestamps it has to be possible to get/scan rows that are hidden by a delete marker. If a marker was placed at time T, it should be possible to retrieve rows with get/scan timerange of [0-T).
This changes the following:
1. MinVersions now also means: Keep this number of version even when the row was deleted.
2. Allow gets/scans to retrieve rows hidden by a delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
The change is pretty small, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1178891
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
TestMinVersions (with a new test for this) passed. Have not run full suite, yet. This is for initial feedback.
Thanks,
Lars

In order to use a replicated cluster as backup and to make true use of HBase's timestamps it has to be possible to get/scan rows that are hidden by a delete marker. If a marker was placed at time T, it should be possible to retrieve rows with get/scan timerange of [0-T).

This changes the following:

1. MinVersions now also means: Keep this number of version even when the row was deleted.

2. Allow gets/scans to retrieve rows hidden by a delete marker.

3. Do not unconditionally collect all deleted rows during a compaction.

The change is pretty small, but the logic is intricate, so please review carefully.

jiraposter@reviews.apache.org
added a comment - 04/Oct/11 22:18
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/#review2323
-----------------------------------------------------------
Nice feature.
Should a new config parameter be introduced ?
Ted
On 2011-10-04 21:27:20, Lars Hofhansl wrote:
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-04 21:27:20)
Review request for hbase.
Summary
-------
In order to use a replicated cluster as backup and to make true use of HBase's timestamps it has to be possible to get/scan rows that are hidden by a delete marker. If a marker was placed at time T, it should be possible to retrieve rows with get/scan timerange of [0-T).
This changes the following:
1. MinVersions now also means: Keep this number of version even when the row was deleted.
2. Allow gets/scans to retrieve rows hidden by a delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
The change is pretty small, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs
-----
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1178891
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
TestMinVersions (with a new test for this) passed. Have not run full suite, yet. This is for initial feedback.
Thanks,
Lars

Re: "New config param". Wasn't sure in the beginning, but now I think this should have really been part of the HBASE-4071.
With config param you mean a new flag on HColumnDescriptor? That would provide for more flexibility, the question is whether it is needed, or would just confuse any users (the minversion feature is confusing as is ).
I'm not opposed to the separate option for this, though.

Lars Hofhansl
added a comment - 05/Oct/11 00:21 Re: "New config param". Wasn't sure in the beginning, but now I think this should have really been part of the HBASE-4071 .
With config param you mean a new flag on HColumnDescriptor? That would provide for more flexibility, the question is whether it is needed, or would just confuse any users (the minversion feature is confusing as is ).
I'm not opposed to the separate option for this, though.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:https://reviews.apache.org/r/2178/
-----------------------------------------------------------

(Updated 2011-10-05 01:06:33.205905)

Review request for hbase.

Changes
-------

Version that actually compiles.

Summary
-------

In order to use a replicated cluster as backup and to make true use of HBase's timestamps it has to be possible to get/scan rows that are hidden by a delete marker. If a marker was placed at time T, it should be possible to retrieve rows with get/scan timerange of [0-T).

This changes the following:
1. MinVersions now also means: Keep this number of version even when the row was deleted.
2. Allow gets/scans to retrieve rows hidden by a delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.

The change is pretty small, but the logic is intricate, so please review carefully.

jiraposter@reviews.apache.org
added a comment - 05/Oct/11 01:07
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-05 01:06:33.205905)
Review request for hbase.
Changes
-------
Version that actually compiles.
Summary
-------
In order to use a replicated cluster as backup and to make true use of HBase's timestamps it has to be possible to get/scan rows that are hidden by a delete marker. If a marker was placed at time T, it should be possible to retrieve rows with get/scan timerange of [0-T).
This changes the following:
1. MinVersions now also means: Keep this number of version even when the row was deleted.
2. Allow gets/scans to retrieve rows hidden by a delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
The change is pretty small, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs (updated)
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1178891
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
TestMinVersions (with a new test for this) passed. Have not run full suite, yet. This is for initial feedback.
Thanks,
Lars

Lars Hofhansl
added a comment - 05/Oct/11 16:51 This causes a slight incompatible change.
Queries in the past before a delete marker will return rows before he marker (this is as it should be IMHO, but it is a change from the current behavior).

Lars Hofhansl
added a comment - 05/Oct/11 17:20 You are right.
I have a new version of the patch now that only does that when minimum versions is enabled for the CF. As minversions is not released, yet, this should be ok.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:https://reviews.apache.org/r/2178/
-----------------------------------------------------------

(Updated 2011-10-05 18:16:19.303065)

Review request for hbase.

Changes
-------

All tests pass now (except TestAdmin, which fails with or without my changes).

Summary
-------

In order to use a replicated cluster as backup and to make true use of HBase's timestamps it has to be possible to get/scan rows that are hidden by a delete marker. If a marker was placed at time T, it should be possible to retrieve rows with get/scan timerange of [0-T).

This changes the following:
1. MinVersions now also means: Keep this number of version even when the row was deleted.
2. Allow gets/scans to retrieve rows hidden by a delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.

The change is pretty small, but the logic is intricate, so please review carefully.

jiraposter@reviews.apache.org
added a comment - 05/Oct/11 18:16
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-05 18:16:19.303065)
Review request for hbase.
Changes
-------
All tests pass now (except TestAdmin, which fails with or without my changes).
Summary
-------
In order to use a replicated cluster as backup and to make true use of HBase's timestamps it has to be possible to get/scan rows that are hidden by a delete marker. If a marker was placed at time T, it should be possible to retrieve rows with get/scan timerange of [0-T).
This changes the following:
1. MinVersions now also means: Keep this number of version even when the row was deleted.
2. Allow gets/scans to retrieve rows hidden by a delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
The change is pretty small, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs (updated)
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1178891
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
TestMinVersions (with a new test for this) passed. Have not run full suite, yet. This is for initial feedback.
Thanks,
Lars

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:https://reviews.apache.org/r/2178/
-----------------------------------------------------------

(Updated 2011-10-05 18:17:40.058774)

Review request for hbase.

Summary (updated)
-------

In order to use a replicated cluster as backup and to make true use of HBase's timestamps it has to be possible to get/scan rows that are hidden by a delete marker. If a marker was placed at time T, it should be possible to retrieve rows with get/scan timerange of [0-T).

This changes the following:
1. MinVersions now also means: Keep this number of version even when the row was deleted.
2. Allow gets/scans to retrieve rows hidden by a delete marker (only if minVersions is set for the CF).
3. Do not unconditionally collect all deleted rows during a compaction.

The change is pretty small, but the logic is intricate, so please review carefully.

jiraposter@reviews.apache.org
added a comment - 05/Oct/11 18:18
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-05 18:17:40.058774)
Review request for hbase.
Summary (updated)
-------
In order to use a replicated cluster as backup and to make true use of HBase's timestamps it has to be possible to get/scan rows that are hidden by a delete marker. If a marker was placed at time T, it should be possible to retrieve rows with get/scan timerange of [0-T).
This changes the following:
1. MinVersions now also means: Keep this number of version even when the row was deleted.
2. Allow gets/scans to retrieve rows hidden by a delete marker (only if minVersions is set for the CF).
3. Do not unconditionally collect all deleted rows during a compaction.
The change is pretty small, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1178891
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1178891
Diff: https://reviews.apache.org/r/2178/diff
Testing (updated)
-------
All test (except TestAdmin, which fail with or without my change) pass now.
Thanks,
Lars

Thinking about this a bit more. I think this would better be covered by a separate flag.

The reason is this: Users might set TTL for a store in order to recreate any state of the store within that TTL. This does not currently work deleted rows.
MinVersins is somewhat orthogonal to this.
If KEEP_DELETED_ROWS (or however we want to call this) is set on a CF, it would just mean that deleted rows and the delete markers are subject to the usual MaxVersions, TTL/MinVersions rules.

Do people think this is not a good idea generally, and this should be rather handled by the application - by simply not deleting anything? (there's fairly little discussion here)

Lars Hofhansl
added a comment - 06/Oct/11 05:38 Thinking about this a bit more. I think this would better be covered by a separate flag.
The reason is this: Users might set TTL for a store in order to recreate any state of the store within that TTL. This does not currently work deleted rows.
MinVersins is somewhat orthogonal to this.
If KEEP_DELETED_ROWS (or however we want to call this) is set on a CF, it would just mean that deleted rows and the delete markers are subject to the usual MaxVersions, TTL/MinVersions rules.
Do people think this is not a good idea generally, and this should be rather handled by the application - by simply not deleting anything? (there's fairly little discussion here)

Ted Yu
added a comment - 06/Oct/11 05:50 Introducing new flag would allow maximum flexibility.
For column family where the flag isn't specified, we can handle deleted rows and the delete markers according to the presence of minVersions.

That might be getting hard to understand. minVersions would have slightly different meaning depending on whether that extra flag is set. Without the flag minVersions is like "maxVersions for deleted rows", not sure who would need that.

Having just the flag for deleted rows would also make the code easier to follow as the column tracker would no longer need to distinguish between "normal" rows, delete markers, and deleted rows as it does in the current patch; but only between rows (deleted or not) and delete markers.

Lars Hofhansl
added a comment - 06/Oct/11 06:29 That might be getting hard to understand. minVersions would have slightly different meaning depending on whether that extra flag is set. Without the flag minVersions is like "maxVersions for deleted rows", not sure who would need that.
Having just the flag for deleted rows would also make the code easier to follow as the column tracker would no longer need to distinguish between "normal" rows, delete markers, and deleted rows as it does in the current patch; but only between rows (deleted or not) and delete markers.

Turns out this is a bit more complicated than I thought. There are three types of deletes:

version deletes - effective for a specific version of a specific column

column deletes - effective for all versions of a specific column

family deletes - effective for all versions of all columns of a family

The first two are sorted before the puts they affect based on their resp. timestamps, but after newer puts.
Family deletes, always sort before all versions of all columns.

The problems is deciding when the delete rows (the marker rows) themselves can be removed during a major compaction.

For #1 and #2 I can just do version counting, and newer puts will eventually push out the delete markers from the store.
With #3 this will never happen as they always sort before all puts of the same family, regardless of any timestamp set on them.
Here it is necessary to scan all puts for that family and then decide whether the delete needs to be included based on whether the delete had any affect on any of the puts in the same family.

Because of this, moving out of 0.92 as changes will be bigger. Put back if you think otherwise.

I still think that timetravel is an important feature of HBase and incomplete if it cannot include deleted rows.

Lars Hofhansl
added a comment - 06/Oct/11 21:07 Turns out this is a bit more complicated than I thought. There are three types of deletes:
version deletes - effective for a specific version of a specific column
column deletes - effective for all versions of a specific column
family deletes - effective for all versions of all columns of a family
The first two are sorted before the puts they affect based on their resp. timestamps, but after newer puts.
Family deletes, always sort before all versions of all columns.
The problems is deciding when the delete rows (the marker rows) themselves can be removed during a major compaction.
For #1 and #2 I can just do version counting, and newer puts will eventually push out the delete markers from the store.
With #3 this will never happen as they always sort before all puts of the same family, regardless of any timestamp set on them.
Here it is necessary to scan all puts for that family and then decide whether the delete needs to be included based on whether the delete had any affect on any of the puts in the same family.
Because of this, moving out of 0.92 as changes will be bigger. Put back if you think otherwise.
I still think that timetravel is an important feature of HBase and incomplete if it cannot include deleted rows.

Jonathan Gray
added a comment - 06/Oct/11 21:19 Lars, I agree that this is an important feature. Also agree that we should take time and do it right and not push for 0.92.
Could we just support some kind of "raw scanner" along with a TTKAKV config (Time To Keep All Key Values)?

A simple option would be to only allow the "KEEP_DELETED" flag set when TTL is also set.
Then we'd do simple version counting for #1 and #2 type deletes and rely on TTL to expire #3.
(That means you could have more #3 delete markers than max versions, which would also be the case with TTKAKV. That might be acceptable).

Lars Hofhansl
added a comment - 06/Oct/11 21:37 A simple option would be to only allow the "KEEP_DELETED" flag set when TTL is also set.
Then we'd do simple version counting for #1 and #2 type deletes and rely on TTL to expire #3.
(That means you could have more #3 delete markers than max versions, which would also be the case with TTKAKV. That might be acceptable).

Another option for the problem above is, to not write the family delete marker during compaction but to instead translate them to column delete marker as we scan through the kvs following a family delete marker.

Lars Hofhansl
added a comment - 07/Oct/11 16:33 Another option for the problem above is, to not write the family delete marker during compaction but to instead translate them to column delete marker as we scan through the kvs following a family delete marker.

Yet another option is to take the smallest time of any of the store files and remove the family markers if they are older than that. The marker may survive two compactions in that case, but eventually they'll be removed.

In addition to address Jon's need, I think we can add a raw flag to the Scan object. If true, the scan will retrieve all available rows including deleted rows and delete markers. With the rest of the changes from this patch, that would be really easy to do.
(I assume I'd have to increment the SCAN_VERSION, correct?)

Lars Hofhansl
added a comment - 08/Oct/11 04:17 Yet another option is to take the smallest time of any of the store files and remove the family markers if they are older than that. The marker may survive two compactions in that case, but eventually they'll be removed.
In addition to address Jon's need, I think we can add a raw flag to the Scan object. If true, the scan will retrieve all available rows including deleted rows and delete markers. With the rest of the changes from this patch, that would be really easy to do.
(I assume I'd have to increment the SCAN_VERSION, correct?)

Was toying with keeping track of the first key written to a file (we currently keep track of the last key).
However, since the family delete hides all puts with a TS less than or equal to the delete TS, this won't help as the delete marker itself eventually would be the first key and it still might affect puts later in the file.

Also KVs are sorted in reverse time order (except family delete markers), so I cannot just look at the put directly following the delete marker, because it'll be the newest put rather then the oldest.

So there are three options I think:
1. Only allow the new flag set on CFs with TTL set. MIN_VERSIONS would not apply to deleted rows or delete marker rows (wouldn't know how long to keep family deletes in that case). (MAX)VERSIONS would still be enforced on all rows types except for family delete markers.
2. Translate family delete markers to column delete marker at (major) compaction time.
3. Change HFileWriterV* to keep track of the earliest put TS in a store and write it to the file metadata. Use that use expire delete marker that are older and hence can't affect any puts in the file.

None of these are particularly attractive. #1 is limiting, #2 might get expensive if there're many columns, #3 would require the FileWriters to understand KVs.

Lars Hofhansl
added a comment - 10/Oct/11 18:29 This issue is pretty resistant to all my attempts to solve it.
Was toying with keeping track of the first key written to a file (we currently keep track of the last key).
However, since the family delete hides all puts with a TS less than or equal to the delete TS, this won't help as the delete marker itself eventually would be the first key and it still might affect puts later in the file.
Also KVs are sorted in reverse time order (except family delete markers), so I cannot just look at the put directly following the delete marker, because it'll be the newest put rather then the oldest.
So there are three options I think:
1. Only allow the new flag set on CFs with TTL set. MIN_VERSIONS would not apply to deleted rows or delete marker rows (wouldn't know how long to keep family deletes in that case). (MAX)VERSIONS would still be enforced on all rows types except for family delete markers.
2. Translate family delete markers to column delete marker at (major) compaction time.
3. Change HFileWriterV* to keep track of the earliest put TS in a store and write it to the file metadata. Use that use expire delete marker that are older and hence can't affect any puts in the file.
None of these are particularly attractive. #1 is limiting, #2 might get expensive if there're many columns, #3 would require the FileWriters to understand KVs.

Lars Hofhansl
added a comment - 10/Oct/11 18:35 and
3a. Have Store.java keep track of the earliest put in internalFlushCache and compactStore and then append it to the file metadata. That way HFileWriterV* would not need to know about KVs.

Lars Hofhansl
added a comment - 11/Oct/11 03:52 That's what I concluded shortly after I posted that.
I made the changes, also added a "raw" option to Scan. All tests pass, including a bunch of new tests.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:https://reviews.apache.org/r/2178/
-----------------------------------------------------------

(Updated 2011-10-11 04:02:54.234402)

Review request for hbase.

Changes
-------

Added a specific flag for HColumnFamilty to enable this.
Family delete markers are now also collected. The way that is done is a bit more complicated than I had hoped.
o StoreFile.writer keeps track of the lowest timestamp of any Put seen and stores in the files metadata.
o Upon major compaction the lowest of all put timestamps of all involved storefiles is passed to the StoreScanner, which can that remove any family delete older than that timestamp.

Tests:
...
Results :

Tests run: 1033, Failures: 0, Errors: 0, Skipped: 16

Summary (updated)
-------

HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.

This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

The change is small'ish, but the logic is intricate, so please review carefully.

jiraposter@reviews.apache.org
added a comment - 11/Oct/11 04:04
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-11 04:02:54.234402)
Review request for hbase.
Changes
-------
Added a specific flag for HColumnFamilty to enable this.
Family delete markers are now also collected. The way that is done is a bit more complicated than I had hoped.
o StoreFile.writer keeps track of the lowest timestamp of any Put seen and stores in the files metadata.
o Upon major compaction the lowest of all put timestamps of all involved storefiles is passed to the StoreScanner, which can that remove any family delete older than that timestamp.
Tests:
...
Results :
Tests run: 1033, Failures: 0, Errors: 0, Skipped: 16
Summary (updated)
-------
HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.
This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.
The change is small'ish, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs (updated)
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181286
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181286
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181286
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181286
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181286
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181286
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181286
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181286
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181286
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181286
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181286
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181286
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181286
Diff: https://reviews.apache.org/r/2178/diff
Testing (updated)
-------
All tests pass now.
Thanks,
Lars

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:https://reviews.apache.org/r/2178/
-----------------------------------------------------------

(Updated 2011-10-11 04:21:07.770408)

Review request for hbase.

Changes
-------

Addressing anticipated nits

Summary
-------

HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.

This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

The change is small'ish, but the logic is intricate, so please review carefully.

jiraposter@reviews.apache.org
added a comment - 11/Oct/11 04:22
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-11 04:21:07.770408)
Review request for hbase.
Changes
-------
Addressing anticipated nits
Summary
-------
HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.
This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.
The change is small'ish, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs (updated)
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
All tests pass now.
Thanks,
Lars

jiraposter@reviews.apache.org
added a comment - 11/Oct/11 06:04
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/#review2501
-----------------------------------------------------------
Elegant solution.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
< https://reviews.apache.org/r/2178/#comment5660 >
Is this changed required by the feature ?
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java
< https://reviews.apache.org/r/2178/#comment5661 >
'the' is redundant.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
< https://reviews.apache.org/r/2178/#comment5662 >
Timerange would be better than timetravel.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
< https://reviews.apache.org/r/2178/#comment5663 >
Would EARLIEST_PUT_TS be a better name ?
Ted
On 2011-10-11 04:21:07, Lars Hofhansl wrote:
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-11 04:21:07)
Review request for hbase.
Summary
-------
HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.
This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.
The change is small'ish, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs
-----
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
All tests pass now.
Thanks,
Lars

jiraposter@reviews.apache.org
added a comment - 11/Oct/11 06:14
On 2011-10-11 06:02:38, Ted Yu wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java , line 106
> < https://reviews.apache.org/r/2178/diff/6/?file=49393#file49393line106 >
>
> Timerange would be better than timetravel.
Agreed.
On 2011-10-11 06:02:38, Ted Yu wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java , line 321
> < https://reviews.apache.org/r/2178/diff/6/?file=49389#file49389line321 >
>
> Is this changed required by the feature ?
No... But it didn't make sense before. Setting minVersions equal to maxVersions is the same as not having TTL.
On 2011-10-11 06:02:38, Ted Yu wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java , line 127
> < https://reviews.apache.org/r/2178/diff/6/?file=49396#file49396line127 >
>
> Would EARLIEST_PUT_TS be a better name ?
Probably.
Lars
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/#review2501
-----------------------------------------------------------
On 2011-10-11 04:21:07, Lars Hofhansl wrote:
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-11 04:21:07)
Review request for hbase.
Summary
-------
HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.
This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.
The change is small'ish, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs
-----
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
All tests pass now.
Thanks,
Lars

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:https://reviews.apache.org/r/2178/
-----------------------------------------------------------

(Updated 2011-10-11 06:20:01.186435)

Review request for hbase.

Changes
-------

o Addressed Ted's point.
o Also added this flag to the shell.

Summary
-------

HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.

This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

The change is small'ish, but the logic is intricate, so please review carefully.

jiraposter@reviews.apache.org
added a comment - 11/Oct/11 06:20
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-11 06:20:01.186435)
Review request for hbase.
Changes
-------
o Addressed Ted's point.
o Also added this flag to the shell.
Summary
-------
HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.
This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.
The change is small'ish, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs (updated)
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
All tests pass now.
Thanks,
Lars

jiraposter@reviews.apache.org
added a comment - 11/Oct/11 06:22
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/#review2503
-----------------------------------------------------------
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
< https://reviews.apache.org/r/2178/#comment5667 >
So minVersions == maxVersions == 1 would no longer be accepted ...
Ted
On 2011-10-11 06:20:01, Lars Hofhansl wrote:
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-11 06:20:01)
Review request for hbase.
Summary
-------
HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.
This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.
The change is small'ish, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs
-----
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
All tests pass now.
Thanks,
Lars

jiraposter@reviews.apache.org
added a comment - 11/Oct/11 06:36
On 2011-10-11 06:21:39, Ted Yu wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java , line 321
> < https://reviews.apache.org/r/2178/diff/6/?file=49389#file49389line321 >
>
> So minVersions == maxVersions == 1 would no longer be accepted ...
Correct. TTL<FOREVER + maxVersions=1 + minVersion=1 would be identical to just setting maxVersions=1 (without TTL or minVersions)
Lars
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/#review2503
-----------------------------------------------------------
On 2011-10-11 06:20:01, Lars Hofhansl wrote:
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-11 06:20:01)
Review request for hbase.
Summary
-------
HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.
This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.
The change is small'ish, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs
-----
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
All tests pass now.
Thanks,
Lars

Lars Hofhansl
added a comment - 11/Oct/11 07:10 You mean set TTL to FOREVER and minVersions to 0? Sure that would work.
On the other hand it indicates that the user did not understand how minVersions works, so I could see an error as well.

Ted Yu
added a comment - 11/Oct/11 07:15 Personally I would love the software which can interpret my expression correctly even though I don't fully know how it works
It's your call what to do here, Lars.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:https://reviews.apache.org/r/2178/
-----------------------------------------------------------

(Updated 2011-10-12 04:55:53.069011)

Review request for hbase, Ted Yu and Jonathan Gray.

Summary
-------

HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.

This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

The change is small'ish, but the logic is intricate, so please review carefully.

jiraposter@reviews.apache.org
added a comment - 12/Oct/11 04:57
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-12 04:55:53.069011)
Review request for hbase, Ted Yu and Jonathan Gray.
Summary
-------
HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.
This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.
The change is small'ish, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
All tests pass now.
Thanks,
Lars

Let this be the convention from here on out. Maybe even two leading underscores and no trailing underscore... but whatever, let this be the convention for system configs in attribute maps going forward.

There is a real old feature request in hbase that asked for this.. let me find the issue.... I can't find it.... user wanted to see deletes too in a scan.

Then with Benoit a week or so back we had to look in hfile because we didn't trust a delete.... we thought there stuff behind it. We could have used this facility to show all that was in hbase (if we'd enabled this new feature of course).

I think this facility should be on by default. Whats the downsides? We don't let go of stuff if < maxversions?

How you going to do versions now? If I ask for a scan beyond a delete will I see maxversions only because thats all we keep between deletes? Will it be the case that I will end up having:

jiraposter@reviews.apache.org
added a comment - 15/Oct/11 04:00
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/#review2609
-----------------------------------------------------------
Great stuff.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
< https://reviews.apache.org/r/2178/#comment5854 >
Javadoc what the new param means
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
< https://reviews.apache.org/r/2178/#comment5855 >
So you made this change to make it so user doesn't have condition where they are scratching their head trying to figure why something is not being let go though TTL is four hours?
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
< https://reviews.apache.org/r/2178/#comment5856 >
Is this value read often? If so you might want to cache it. Profiling these things can sometimes show up as costing.
Not important but if you get to make new patch do
if (test) block
or
if (test)
{
block
}
rather than the
if (test)
block
you have.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java
< https://reviews.apache.org/r/2178/#comment5857 >
Let this be the convention from here on out. Maybe even two leading underscores and no trailing underscore... but whatever, let this be the convention for system configs in attribute maps going forward.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java
< https://reviews.apache.org/r/2178/#comment5858 >
Good.
There is a real old feature request in hbase that asked for this.. let me find the issue.... I can't find it.... user wanted to see deletes too in a scan.
Then with Benoit a week or so back we had to look in hfile because we didn't trust a delete.... we thought there stuff behind it. We could have used this facility to show all that was in hbase (if we'd enabled this new feature of course).
I think this facility should be on by default. Whats the downsides? We don't let go of stuff if < maxversions?
How you going to do versions now? If I ask for a scan beyond a delete will I see maxversions only because thats all we keep between deletes? Will it be the case that I will end up having:
maxversions... delete marker .... maxversions ... delete marker .... maxversions...
and so on if this facility is enabled?
I won't have:
as many versions as was written but we only return maxversions......delete marker...... as many versions as were written...... delete marker......
Is that right? (If you follow me).
Say 'garbage collected' in the comment rather than just 'collected'
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java
< https://reviews.apache.org/r/2178/#comment5859 >
Bad name for a param. This is name you'd give the getter for a param named 'delete'.
What is this param doing? Is this the raw attribute or is it from HCD saying keep deletes? Make it clear in javadoc.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
< https://reviews.apache.org/r/2178/#comment5860 >
good
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
< https://reviews.apache.org/r/2178/#comment5861 >
good
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
< https://reviews.apache.org/r/2178/#comment5862 >
This stuff needs to make it out into the hbase book. Good stuff.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
< https://reviews.apache.org/r/2178/#comment5863 >
So when I am doing a check for max versions = 3 and I have for a single cell:
put ... put....delete....put
If I keep delete markers and do a raw scan, I'll get: p p d
If I have keep delete markers but not a raw scan I get: p p
If I do not have keep delete markers and its not an ordinary scan: p p
Something like that?
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
< https://reviews.apache.org/r/2178/#comment5864 >
I'm not sure I get what this one is about.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
< https://reviews.apache.org/r/2178/#comment5866 >
An 'of' should be an 'if' here?
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
< https://reviews.apache.org/r/2178/#comment5867 >
These params on an SQM are starting to build up.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
< https://reviews.apache.org/r/2178/#comment5868 >
I don't understand whats up here. Why we changing timerange if seePastDeleteMarker is set?
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
< https://reviews.apache.org/r/2178/#comment5869 >
Is 'keepDeletedRows' a bad name for this variable? Should it be keepDeletedCells?
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
< https://reviews.apache.org/r/2178/#comment5870 >
So, this is a delete kv at this stage of the processing? So, if its older than any put in any storefile on major compactions, we can let it go since has no effect? Thats good.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
< https://reviews.apache.org/r/2178/#comment5871 >
This makes me nervous. All old tests pass on this code? You have a unit test to test that we are doing the right thing in here both for the old and the new handling?
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java
< https://reviews.apache.org/r/2178/#comment5872 >
Again, bad name for param.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java
< https://reviews.apache.org/r/2178/#comment5873 >
Is this right? We should count it as a version if keepdeletes is on? Unless this flag is saying 'keepdeletes' and not 'this is a delete kv' as I'm interpreting it).
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
< https://reviews.apache.org/r/2178/#comment5874 >
No biggie... previous formatting was better.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
< https://reviews.apache.org/r/2178/#comment5875 >
This belongs in HCD rather than up here in this global HConstants?
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
< https://reviews.apache.org/r/2178/#comment5876 >
Why we move this to the end?
http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb
< https://reviews.apache.org/r/2178/#comment5877 >
Very nice.
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java
< https://reviews.apache.org/r/2178/#comment5878 >
Good test
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java
< https://reviews.apache.org/r/2178/#comment5879 >
White space
Michael
On 2011-10-12 04:55:53, Lars Hofhansl wrote:
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-12 04:55:53)
Review request for hbase, Ted Yu and Jonathan Gray.
Summary
-------
HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.
This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.
The change is small'ish, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs
-----
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
All tests pass now.
Thanks,
Lars

Thanks for the thorough review. I since wrote a bunch of more tests and that way found some more problems.
I'll add comments below and also attach a new patch soon.

What I do not like about this patch is that "entropy" in the ScanQueryMatcher now. There are too many of statements and the logic is very hard to follow. On the hand I have not found a way to make it simpler.

> Let this be the convention from here on out. Maybe even two leading underscores and no trailing underscore... but whatever, let this be the convention for system configs in attribute maps going forward.

I was thinking to document this in the Javadoc of Attributes.java. There's already DEFAULT_CLUSTER_ID in puts, which has an attribute prefixes by on _, I can change that to two _. The trailing underscore does not matter, I guess.

> There is a real old feature request in hbase that asked for this.. let me find the issue.... I can't find it.... user wanted to see deletes too in a scan.

>

> Then with Benoit a week or so back we had to look in hfile because we didn't trust a delete.... we thought there stuff behind it. We could have used this facility to show all that was in hbase (if we'd enabled this new feature of course).

>

> I think this facility should be on by default. Whats the downsides? We don't let go of stuff if < maxversions?

>

> How you going to do versions now? If I ask for a scan beyond a delete will I see maxversions only because thats all we keep between deletes? Will it be the case that I will end up having:

> as many versions as was written but we only return maxversions......delete marker...... as many versions as were written...... delete marker......

>

> Is that right? (If you follow me).

>

> Say 'garbage collected' in the comment rather than just 'collected'

They way it works now, is that it is guaranteed to see all relevant deletes.
Say VERSIONS=2 and you have
put, put, put, delete, delete (all of the same row). You would only get the puts back, because the last put and the deletes are guaranteed to have no effect.

Maybe also have something like: delete, delete, delete, put, delete. In that case you get everything, as it cannot be shown that any of the deletes is not significant.

In any case, deletes do not count towards the version number of following puts or deletes.

> I don't understand whats up here. Why we changing timerange if seePastDeleteMarker is set?

Ah yes. This is why the scanner needs to know whether we want to be able scan past delete markers. The *afterTimeRange will add all delete markers > some TS to the set of deletes and hence hide all earlier puts. In order to allow range queries before a delete marker the marker must only be included if it is inside the time range of the query.

> So, this is a delete kv at this stage of the processing? So, if its older than any put in any storefile on major compactions, we can let it go since has no effect? Thats good.

For this to work I added a new meta data item to StoreFiles: the timestamp of the earliest put found. Not pretty, but I cannot think of anything else.
This is used for family delete marker because they are always before all puts regardless of timestamp

jiraposter@reviews.apache.org
added a comment - 15/Oct/11 05:24
On 2011-10-15 04:00:08, Michael Stack wrote:
> Great stuff.
Thanks for the thorough review. I since wrote a bunch of more tests and that way found some more problems.
I'll add comments below and also attach a new patch soon.
What I do not like about this patch is that "entropy" in the ScanQueryMatcher now. There are too many of statements and the logic is very hard to follow. On the hand I have not found a way to make it simpler.
On 2011-10-15 04:00:08, Michael Stack wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java , line 322
> < https://reviews.apache.org/r/2178/diff/7/?file=49403#file49403line322 >
>
> So you made this change to make it so user doesn't have condition where they are scratching their head trying to figure why something is not being let go though TTL is four hours?
Exactly. MinVersion = MaxVersion makes no sense. I can do that in a separate jira and for 0.92 too.
On 2011-10-15 04:00:08, Michael Stack wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java , line 556
> < https://reviews.apache.org/r/2178/diff/7/?file=49403#file49403line556 >
>
> Is this value read often? If so you might want to cache it. Profiling these things can sometimes show up as costing.
>
> Not important but if you get to make new patch do
>
> if (test) block
>
> or
>
> if (test) {
bq. > block
bq. > }
>
> rather than the
>
> if (test)
> block
>
> you have.
This is only read by the Store.<init>, so should be OK.
Absolutely agree of the if layout, this somehow sneaked in here.
On 2011-10-15 04:00:08, Michael Stack wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java , line 86
> < https://reviews.apache.org/r/2178/diff/7/?file=49404#file49404line86 >
>
> Let this be the convention from here on out. Maybe even two leading underscores and no trailing underscore... but whatever, let this be the convention for system configs in attribute maps going forward.
I was thinking to document this in the Javadoc of Attributes.java. There's already DEFAULT_CLUSTER_ID in puts, which has an attribute prefixes by on _, I can change that to two _. The trailing underscore does not matter, I guess.
On 2011-10-15 04:00:08, Michael Stack wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java , line 711
> < https://reviews.apache.org/r/2178/diff/7/?file=49404#file49404line711 >
>
> Good.
>
> There is a real old feature request in hbase that asked for this.. let me find the issue.... I can't find it.... user wanted to see deletes too in a scan.
>
> Then with Benoit a week or so back we had to look in hfile because we didn't trust a delete.... we thought there stuff behind it. We could have used this facility to show all that was in hbase (if we'd enabled this new feature of course).
>
> I think this facility should be on by default. Whats the downsides? We don't let go of stuff if < maxversions?
>
> How you going to do versions now? If I ask for a scan beyond a delete will I see maxversions only because thats all we keep between deletes? Will it be the case that I will end up having:
>
> maxversions... delete marker .... maxversions ... delete marker .... maxversions...
>
> and so on if this facility is enabled?
>
> I won't have:
>
> as many versions as was written but we only return maxversions......delete marker...... as many versions as were written...... delete marker......
>
> Is that right? (If you follow me).
>
> Say 'garbage collected' in the comment rather than just 'collected'
They way it works now, is that it is guaranteed to see all relevant deletes.
Say VERSIONS=2 and you have
put, put, put, delete, delete (all of the same row). You would only get the puts back, because the last put and the deletes are guaranteed to have no effect.
Maybe also have something like: delete, delete, delete, put, delete. In that case you get everything, as it cannot be shown that any of the deletes is not significant.
In any case, deletes do not count towards the version number of following puts or deletes.
On 2011-10-15 04:00:08, Michael Stack wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java , line 59
> < https://reviews.apache.org/r/2178/diff/7/?file=49407#file49407line59 >
>
> So when I am doing a check for max versions = 3 and I have for a single cell:
>
>
> put ... put....delete....put
>
>
> If I keep delete markers and do a raw scan, I'll get: p p d
>
> If I have keep delete markers but not a raw scan I get: p p
>
> If I do not have keep delete markers and its not an ordinary scan: p p
>
> Something like that?
Exactly. That's why the login here is a bit convoluted now.
On 2011-10-15 04:00:08, Michael Stack wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java , line 60
> < https://reviews.apache.org/r/2178/diff/7/?file=49407#file49407line60 >
>
> I'm not sure I get what this one is about.
This is what enabled "ordinary" scan to see rows behind a delete marker. Ted suggested to relabel those to "time range" queries.
On 2011-10-15 04:00:08, Michael Stack wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java , line 222
> < https://reviews.apache.org/r/2178/diff/7/?file=49407#file49407line222 >
>
> I don't understand whats up here. Why we changing timerange if seePastDeleteMarker is set?
Ah yes. This is why the scanner needs to know whether we want to be able scan past delete markers. The *afterTimeRange will add all delete markers > some TS to the set of deletes and hence hide all earlier puts. In order to allow range queries before a delete marker the marker must only be included if it is inside the time range of the query.
On 2011-10-15 04:00:08, Michael Stack wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java , line 232
> < https://reviews.apache.org/r/2178/diff/7/?file=49407#file49407line232 >
>
> Is 'keepDeletedRows' a bad name for this variable? Should it be keepDeletedCells?
yes
On 2011-10-15 04:00:08, Michael Stack wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java , line 235
> < https://reviews.apache.org/r/2178/diff/7/?file=49407#file49407line235 >
>
> So, this is a delete kv at this stage of the processing? So, if its older than any put in any storefile on major compactions, we can let it go since has no effect? Thats good.
For this to work I added a new meta data item to StoreFiles: the timestamp of the earliest put found. Not pretty, but I cannot think of anything else.
This is used for family delete marker because they are always before all puts regardless of timestamp
On 2011-10-15 04:00:08, Michael Stack wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java , line 243
> < https://reviews.apache.org/r/2178/diff/7/?file=49407#file49407line243 >
>
> This makes me nervous. All old tests pass on this code? You have a unit test to test that we are doing the right thing in here both for the old and the new handling?
All tests pass
This is because delete markers can now fall through and be subject to this test as well.
Looking at that now, though, this not necessary. I'll remove it, and rerun my test.
On 2011-10-15 04:00:08, Michael Stack wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java , line 73
> < https://reviews.apache.org/r/2178/diff/7/?file=49408#file49408line73 >
>
> Again, bad name for param.
Changed already.
On 2011-10-15 04:00:08, Michael Stack wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java , line 77
> < https://reviews.apache.org/r/2178/diff/7/?file=49408#file49408line77 >
>
> Is this right? We should count it as a version if keepdeletes is on? Unless this flag is saying 'keepdeletes' and not 'this is a delete kv' as I'm interpreting it).
That would certainly make things easier. But would it be right?
If VERSIONS is 3 and I have
delete, put, delete, put, then the last put would be forever unreachable.
If we think that's OK, I can simplify the code.
On 2011-10-15 04:00:08, Michael Stack wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java , line 487
> < https://reviews.apache.org/r/2178/diff/7/?file=49409#file49409line487 >
>
> No biggie... previous formatting was better.
Heh, OK.
On 2011-10-15 04:00:08, Michael Stack wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java , line 371
> < https://reviews.apache.org/r/2178/diff/7/?file=49411#file49411line371 >
>
> Why we move this to the end?
Oops... Leftover from other attempts. Will move it back.
On 2011-10-15 04:00:08, Michael Stack wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java , line 133
> < https://reviews.apache.org/r/2178/diff/7/?file=49415#file49415line133 >
>
> Good test
I add way more tests... new patch coming soon.
On 2011-10-15 04:00:08, Michael Stack wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java , line 127
> < https://reviews.apache.org/r/2178/diff/7/?file=49410#file49410line127 >
>
> This belongs in HCD rather than up here in this global HConstants?
Should it? It's very store file specific.
On 2011-10-15 04:00:08, Michael Stack wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java , line 50
> < https://reviews.apache.org/r/2178/diff/7/?file=49405#file49405line50 >
>
> Bad name for a param. This is name you'd give the getter for a param named 'delete'.
>
> What is this param doing? Is this the raw attribute or is it from HCD saying keep deletes? Make it clear in javadoc.
I changed the logic now to actually pass in the type of the kv.
On 2011-10-15 04:00:08, Michael Stack wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java , line 106
> < https://reviews.apache.org/r/2178/diff/7/?file=49407#file49407line106 >
>
> These params on an SQM are starting to build up.
Yes... And I do not like it... At all!
Maybe we should pass in the Store itself.
I was also thinking about an enum for the scan type too. There are seven different cases:
1. major colection, not keeping deleted rows
2. major colection, keeping deleted rows
3. minor colection, not keeping deleted rows
4. minor colection, keeping deleted rows
5. ordinary scan, with delete peeking
6. ordinary scan, without delete peeking
7. raw scan
Lars
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/#review2609
-----------------------------------------------------------
On 2011-10-12 04:55:53, Lars Hofhansl wrote:
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-12 04:55:53)
Review request for hbase, Ted Yu and Jonathan Gray.
Summary
-------
HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.
This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.
The change is small'ish, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs
-----
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1181616
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1181616
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
All tests pass now.
Thanks,
Lars

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:https://reviews.apache.org/r/2178/
-----------------------------------------------------------

(Updated 2011-10-15 05:30:12.128914)

Review request for hbase, Ted Yu and Jonathan Gray.

Changes
-------

Many more tests. Found some bugs with these tests: deletes and put may have the same TS, the scanner would think it's a duplicate row and remove it.

Summary
-------

HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.

This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

The change is small'ish, but the logic is intricate, so please review carefully.

jiraposter@reviews.apache.org
added a comment - 15/Oct/11 05:30
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-15 05:30:12.128914)
Review request for hbase, Ted Yu and Jonathan Gray.
Changes
-------
Many more tests. Found some bugs with these tests: deletes and put may have the same TS, the scanner would think it's a duplicate row and remove it.
Summary
-------
HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.
This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.
The change is small'ish, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs (updated)
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1183459
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1183459
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1183459
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1183459
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1183459
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1183459
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1183459
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183459
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1183459
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1183459
http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1183459
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1183459
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1183459
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1183459
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1183459
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
All tests pass now.
Thanks,
Lars

Ted Yu
added a comment - 15/Oct/11 12:30 Continuing discussion started 11/Oct/11 06:45, I think the following check at line 321:
if (minVersions >= maxVersions) {
should at least be lifted above the check at line 318:
if (timeToLive == HConstants.FOREVER) {

Ted Yu
added a comment - 15/Oct/11 12:41 I was also thinking about an enum for the scan type too. There are seven different cases:
Since the cases span multiple dimensions (compaction, keeping deleted rows), enum is not good candidate.
How about introducing MatcherConfig so that new parameters can be added relatively easily in the future ?

@Ted, I can change the order in which the checks are made. MIN_VERSIONS make no sense without TTL, thouhg.

@Ted and @Stack: If we indeed make all of this the default behavior a lot of the if statements and parameters in ScanQueryMatcher would go away, simplifying the logic a LOT.
That would be mean that every store would keep deleted rows (until VERSIONS, or TTL removes them), and scanners would always be able to peek behind delete markers.

Lars Hofhansl
added a comment - 15/Oct/11 23:25 @Ted, I can change the order in which the checks are made. MIN_VERSIONS make no sense without TTL, thouhg.
@Ted and @Stack: If we indeed make all of this the default behavior a lot of the if statements and parameters in ScanQueryMatcher would go away, simplifying the logic a LOT.
That would be mean that every store would keep deleted rows (until VERSIONS, or TTL removes them), and scanners would always be able to peek behind delete markers.

Fair enough. Was picking up on Stack's suggestion to have this on by default. Just means the code has to distinguish between minor and major compaction scans, raw scans, and normal user scans, all of which can be for a store with keep_delete_cells enabled or not.

Thinking about a ScanConfig (or ScanInfo) as static inner class of Store. That would capture all immutable scan-relevant information about the Store (min/max version, family name, ttl, keep_deletes, comparator). (A MatcherConfig with all information would need to be mutable and created or changed for every scan.)
And then maybe a ScanType enum to distinguish between compaction scans and user initiated scans.

What about Stack's suggested in the review to include delete cells in the version count? (The only strange part with that is that the family markers are always in the beginning).
Right now a delete cell does not increase the version count and instead "inherits" the version of the last put.

Lars Hofhansl
added a comment - 16/Oct/11 05:54 Fair enough. Was picking up on Stack's suggestion to have this on by default. Just means the code has to distinguish between minor and major compaction scans, raw scans, and normal user scans, all of which can be for a store with keep_delete_cells enabled or not.
Thinking about a ScanConfig (or ScanInfo) as static inner class of Store. That would capture all immutable scan-relevant information about the Store (min/max version, family name, ttl, keep_deletes, comparator). (A MatcherConfig with all information would need to be mutable and created or changed for every scan.)
And then maybe a ScanType enum to distinguish between compaction scans and user initiated scans.
What about Stack's suggested in the review to include delete cells in the version count? (The only strange part with that is that the family markers are always in the beginning).
Right now a delete cell does not increase the version count and instead "inherits" the version of the last put.

Ted Yu
added a comment - 16/Oct/11 14:50 Thinking about a ScanConfig (or ScanInfo)
ScanInfo seems to be a better name.
And then maybe a ScanType enum
This is good.
a delete cell does not increase the version count
This should be fine.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:https://reviews.apache.org/r/2178/
-----------------------------------------------------------

(Updated 2011-10-17 05:32:49.376397)

Review request for hbase, Ted Yu and Jonathan Gray.

Changes
-------

New day, new version of the patch

o Added yet more tests.
o Introduced class Store.ScanInfo and enum StoreScanner.ScanType to make more sense of the options passed to ScanQueryMatcher.

This is hopefully close.

Summary
-------

HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.

This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

The change is small'ish, but the logic is intricate, so please review carefully.

jiraposter@reviews.apache.org
added a comment - 17/Oct/11 05:34
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-17 05:32:49.376397)
Review request for hbase, Ted Yu and Jonathan Gray.
Changes
-------
New day, new version of the patch
o Added yet more tests.
o Introduced class Store.ScanInfo and enum StoreScanner.ScanType to make more sense of the options passed to ScanQueryMatcher.
This is hopefully close.
Summary
-------
HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.
This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.
The change is small'ish, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs (updated)
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1184947
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
All tests pass now.
Thanks,
Lars

jiraposter@reviews.apache.org
added a comment - 17/Oct/11 06:44
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/#review2616
-----------------------------------------------------------
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
< https://reviews.apache.org/r/2178/#comment5911 >
Please add 'any' in front of columns and change columns to singular form.
Maybe columns.size should be checked against 0 as well ?
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
< https://reviews.apache.org/r/2178/#comment5912 >
Much shorter and readable now.
Ted
On 2011-10-17 05:32:49, Lars Hofhansl wrote:
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-17 05:32:49)
Review request for hbase, Ted Yu and Jonathan Gray.
Summary
-------
HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.
This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.
The change is small'ish, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs
-----
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1184947
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1184947
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
All tests pass now.
Thanks,
Lars

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:https://reviews.apache.org/r/2178/
-----------------------------------------------------------

(Updated 2011-10-18 19:29:30.328890)

Review request for hbase, Ted Yu and Jonathan Gray.

Changes
-------

This one passes all tests again. Added some more comments.

Summary
-------

HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.

This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

The change is small'ish, but the logic is intricate, so please review carefully.

jiraposter@reviews.apache.org
added a comment - 18/Oct/11 19:31
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-18 19:29:30.328890)
Review request for hbase, Ted Yu and Jonathan Gray.
Changes
-------
This one passes all tests again. Added some more comments.
Summary
-------
HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.
This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.
The change is small'ish, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs (updated)
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
All tests pass now.
Thanks,
Lars

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:https://reviews.apache.org/r/2178/
-----------------------------------------------------------

(Updated 2011-10-18 21:29:47.823962)

Review request for hbase, Ted Yu and Jonathan Gray.

Changes
-------

Addressed all of Ted's comments. Added one more basic test for existing behavior.

This could be the one

Summary
-------

HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.

This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

The change is small'ish, but the logic is intricate, so please review carefully.

jiraposter@reviews.apache.org
added a comment - 18/Oct/11 21:31
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-18 21:29:47.823962)
Review request for hbase, Ted Yu and Jonathan Gray.
Changes
-------
Addressed all of Ted's comments. Added one more basic test for existing behavior.
This could be the one
Summary
-------
HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.
This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.
The change is small'ish, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs (updated)
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
All tests pass now.
Thanks,
Lars

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:https://reviews.apache.org/r/2178/
-----------------------------------------------------------

(Updated 2011-10-18 21:43:38.886408)

Review request for hbase, Ted Yu and Jonathan Gray.

Changes
-------

Arrghhh... Fixed one more Javadoc typo.

Summary
-------

HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.

This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.

The change is small'ish, but the logic is intricate, so please review carefully.

jiraposter@reviews.apache.org
added a comment - 18/Oct/11 21:45
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-18 21:43:38.886408)
Review request for hbase, Ted Yu and Jonathan Gray.
Changes
-------
Arrghhh... Fixed one more Javadoc typo.
Summary
-------
HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.
This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.
The change is small'ish, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs (updated)
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
All tests pass now.
Thanks,
Lars

jiraposter@reviews.apache.org
added a comment - 19/Oct/11 21:17
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/#review2677
-----------------------------------------------------------
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
< https://reviews.apache.org/r/2178/#comment6040 >
If KEEP_DELETED_CELLS is true then what will be the behavior of GETs? Will gets be able to reach the deleted cells as well?
(If both gets and deletes are able to fetch the deleted cells then why keep delete markers?)
Can the value of KEEP_DELETD_CELLS for a column family dynamically altered in hbase-shell?
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
< https://reviews.apache.org/r/2178/#comment6038 >
Why only ExplicitColumnTracker? A "delete family marker" that doesn't have any column qualifier shouldn't be passed to any kind of column tracker, right?
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
< https://reviews.apache.org/r/2178/#comment6037 >
Jsut as in checkColumns() should there be an assert for this method also that a delete marker should never call it? A delete family marker doesn't have the column qualifier.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
< https://reviews.apache.org/r/2178/#comment6039 >
If keepDeletedCells is true and retainDeletesInOutput is false then a delete-marker-kv can reach here and fail the assert in checkColumn()?
Prakash
On 2011-10-18 21:43:38, Lars Hofhansl wrote:
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-18 21:43:38)
Review request for hbase, Ted Yu and Jonathan Gray.
Summary
-------
HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.
This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.
The change is small'ish, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs
-----
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
All tests pass now.
Thanks,
Lars

> If KEEP_DELETED_CELLS is true then what will be the behavior of GETs? Will gets be able to reach the deleted cells as well?

>

> (If both gets and deletes are able to fetch the deleted cells then why keep delete markers?)

>

> Can the value of KEEP_DELETD_CELLS for a column family dynamically altered in hbase-shell?

Gets and Scans only see deleted rows when they use a time range that ends before the resp. delete marker.
The idea is that by using timerange [0,T+1), you can query the system for the state it was in at time T.
If that range includes the delete marker you won't see the deleted rows, if it does not you will.

> Why only ExplicitColumnTracker? A "delete family marker" that doesn't have any column qualifier shouldn't be passed to any kind of column tracker, right?

During compactions the ScanWildcardColumnTracker is used, always.
For the new "raw" scans I simply do not want to support ExplicitColumnTracker. There's a check that prevents a "raw" scan from adding any columns.

Note that a "raw" scan is not the same as a time range scan/get. Jon G suggested that'd be a good thing to have, and it was easy to add. A "raw" scan sees everything. Including the delete markers, and could be used (for example) to do custom replication which includes delete markers and deleted rows.

ScanWildcardColumnTracker has all the new logic to deal with delete markers and deleted rows.

jiraposter@reviews.apache.org
added a comment - 19/Oct/11 22:33
On 2011-10-19 21:16:42, Prakash Khemani wrote:
>
Thanks for the review Prakash.
On 2011-10-19 21:16:42, Prakash Khemani wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java , line 160
> < https://reviews.apache.org/r/2178/diff/12/?file=50948#file50948line160 >
>
> If KEEP_DELETED_CELLS is true then what will be the behavior of GETs? Will gets be able to reach the deleted cells as well?
>
> (If both gets and deletes are able to fetch the deleted cells then why keep delete markers?)
>
> Can the value of KEEP_DELETD_CELLS for a column family dynamically altered in hbase-shell?
Gets and Scans only see deleted rows when they use a time range that ends before the resp. delete marker.
The idea is that by using timerange [0,T+1), you can query the system for the state it was in at time T.
If that range includes the delete marker you won't see the deleted rows, if it does not you will.
On 2011-10-19 21:16:42, Prakash Khemani wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java , line 114
> < https://reviews.apache.org/r/2178/diff/12/?file=50953#file50953line114 >
>
> Why only ExplicitColumnTracker? A "delete family marker" that doesn't have any column qualifier shouldn't be passed to any kind of column tracker, right?
During compactions the ScanWildcardColumnTracker is used, always.
For the new "raw" scans I simply do not want to support ExplicitColumnTracker. There's a check that prevents a "raw" scan from adding any columns.
Note that a "raw" scan is not the same as a time range scan/get. Jon G suggested that'd be a good thing to have, and it was easy to add. A "raw" scan sees everything. Including the delete markers, and could be used (for example) to do custom replication which includes delete markers and deleted rows.
ScanWildcardColumnTracker has all the new logic to deal with delete markers and deleted rows.
On 2011-10-19 21:16:42, Prakash Khemani wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java , line 274
> < https://reviews.apache.org/r/2178/diff/12/?file=50954#file50954line274 >
>
> Jsut as in checkColumns() should there be an assert for this method also that a delete marker should never call it? A delete family marker doesn't have the column qualifier.
Delete marker should still be subject to this, so this is ok.
On 2011-10-19 21:16:42, Prakash Khemani wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java , line 297
> < https://reviews.apache.org/r/2178/diff/12/?file=50954#file50954line297 >
>
> If keepDeletedCells is true and retainDeletesInOutput is false then a delete-marker-kv can reach here and fail the assert in checkColumn()?
no, because retainDeletesInOutput is only ever true for: a minor compaction, a memstore flush, and a raw scan. In all cases no columns can be specified.
Lars
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/#review2677
-----------------------------------------------------------
On 2011-10-18 21:43:38, Lars Hofhansl wrote:
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-18 21:43:38)
Review request for hbase, Ted Yu and Jonathan Gray.
Summary
-------
HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.
This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.
The change is small'ish, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs
-----
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
All tests pass now.
Thanks,
Lars

Very awesome stuff, Lars. I like the new scan info/enum stuff and lots of good comments. The code in SQM.match() is a bit scary but you've done a nice job at keeping it clean and documenting the heck out of it. I was able to grok it, mostly, in not much time.

It would be good to have some documentation somewhere about how exactly this is used. The raw scanner is pretty sweet but it's not exactly clear to me how i set those params without reading code (and converting from the logic in code back to the configs available is tricky).

I'm +1 (trunk). But maybe wait for someone else who has reviewed this more for the final go ahead.

jiraposter@reviews.apache.org
added a comment - 19/Oct/11 23:01
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/#review2684
-----------------------------------------------------------
Very awesome stuff, Lars. I like the new scan info/enum stuff and lots of good comments. The code in SQM.match() is a bit scary but you've done a nice job at keeping it clean and documenting the heck out of it. I was able to grok it, mostly, in not much time.
It would be good to have some documentation somewhere about how exactly this is used. The raw scanner is pretty sweet but it's not exactly clear to me how i set those params without reading code (and converting from the logic in code back to the configs available is tricky).
I'm +1 (trunk). But maybe wait for someone else who has reviewed this more for the final go ahead.
Jonathan
On 2011-10-18 21:43:38, Lars Hofhansl wrote:
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-18 21:43:38)
Review request for hbase, Ted Yu and Jonathan Gray.
Summary
-------
HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.
This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.
The change is small'ish, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs
-----
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
All tests pass now.
Thanks,
Lars

jiraposter@reviews.apache.org
added a comment - 20/Oct/11 04:40
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/#review2694
-----------------------------------------------------------
Thank Lars. Was able to understand the diff after your explanations.
Prakash
On 2011-10-18 21:43:38, Lars Hofhansl wrote:
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-18 21:43:38)
Review request for hbase, Ted Yu and Jonathan Gray.
Summary
-------
HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.
This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.
The change is small'ish, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs
-----
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
All tests pass now.
Thanks,
Lars

Lars Hofhansl
added a comment - 20/Oct/11 05:59
Added a section to the book explaining keeping deleted rows.
Clarified wording in minimum number of versions, and also changed the documented constraints (namely that minVersions must be < maxVersions)
Added a simple client side test (that also serves as an example of how one would use a raw scanner from the client side).

jiraposter@reviews.apache.org
added a comment - 02/Nov/11 06:45
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/#review3009
-----------------------------------------------------------
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
< https://reviews.apache.org/r/2178/#comment6726 >
Hi Lars, Isn't this early-out problematic? It doesn't take into account min-versions. It doesn't take into account the newly introduced keepDeletedCells mode.
Prakash
On 2011-10-18 21:43:38, Lars Hofhansl wrote:
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-18 21:43:38)
Review request for hbase, Ted Yu and Jonathan Gray.
Summary
-------
HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.
This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.
The change is small'ish, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs
-----
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
All tests pass now.
Thanks,
Lars

jiraposter@reviews.apache.org
added a comment - 02/Nov/11 18:45
On 2011-11-02 06:44:54, Prakash Khemani wrote:
> http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java , lines 210-212
> < https://reviews.apache.org/r/2178/diff/12/?file=50954#file50954line210 >
>
> Hi Lars, Isn't this early-out problematic? It doesn't take into account min-versions. It doesn't take into account the newly introduced keepDeletedCells mode.
The early out only happens when miversions is not set. Check out *ColumnTracker
Lars
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/#review3009
-----------------------------------------------------------
On 2011-10-18 21:43:38, Lars Hofhansl wrote:
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/
-----------------------------------------------------------
(Updated 2011-10-18 21:43:38)
Review request for hbase, Ted Yu and Jonathan Gray.
Summary
-------
HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. look at the state of the data at any point in the past, provided the data is still around.
This did not work for deletes, however. Deletes would always mask all puts in the past.
This change adds a flag that can be on HColumnDescriptor to enable retention of deleted rows.
These rows are still subject to TTL and/or VERSIONS.
This changes the following:
1. There is a new flag on HColumnDescriptor enabling that behavior.
2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
3. Do not unconditionally collect all deleted rows during a compaction.
4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.
The change is small'ish, but the logic is intricate, so please review carefully.
This addresses bug HBASE-4536 .
https://issues.apache.org/jira/browse/HBASE-4536
Diffs
-----
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1185362
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1185362
Diff: https://reviews.apache.org/r/2178/diff
Testing
-------
All tests pass now.
Thanks,
Lars