Details

Description

Currently, the ability for a core developer to add per-table & per-CF configuration settings is very heavyweight. You need to add a reserved keyword all the way up the stack & you have to support this variable long-term if you're going to expose it explicitly to the user. This has ended up with using Configuration.get() a lot because it is lightweight and you can tweak settings while you're trying to understand system behavior [since there are many config params that may never need to be tuned]. We need to add the ability to put & read arbitrary KV settings in the HBase schema. Combined with online schema change, this will allow us to safely iterate on configuration settings.

Activity

My current proposal is to create a derived class off HBaseConfiguration called HBaseTableConfiguration. Also, create a HBaseStoreConfiguration derived from HBaseTableConfiguration. This will allow us to specify all the normal Config.get() keys on a per-table, per-CF basis without a major code refactor. Still need to flush out how this would look in the schema, since you probably want to distinguish between reserved keywords and "at your own risk" hidden settings.

For config items that we later identify as important, we can create a reserved keyword and then map it to the old config name. This will allow us to iterate fast & stabilize without needing explicit schema migration.

Nicolas Spiegelberg
added a comment - 04/Feb/12 02:00 My current proposal is to create a derived class off HBaseConfiguration called HBaseTableConfiguration. Also, create a HBaseStoreConfiguration derived from HBaseTableConfiguration. This will allow us to specify all the normal Config.get() keys on a per-table, per-CF basis without a major code refactor. Still need to flush out how this would look in the schema, since you probably want to distinguish between reserved keywords and "at your own risk" hidden settings.
For config items that we later identify as important, we can create a reserved keyword and then map it to the old config name. This will allow us to iterate fast & stabilize without needing explicit schema migration.

Lars Hofhansl
added a comment - 04/Feb/12 04:44 We were just discussing here at SFDC to add a generalized way to serialized Configurations into a HTableDescriptor. What you propose is much cleaner than what we had in mind. So +1
Are you planning some kind of classification theme to distinguish config settings that only make sense globally from those that make sense per CF/Table?
Short term I was planning to add the ability to set arbitrary HTableDescriptor value through the shell, I'll hold of on that.

@Lars: the original idea was to allow users to arbitrarily set KVs in the HTableDescriptor and HColumnDescriptor, but make it so users know that what they're doing is not checked. Need some sort of format to distinguish between reserved keywords and non-reserved (thinking of doing this on the client side). As a config value becomes more well-known, we can enforce limitations like you stated.

I'd rather have this evolve by having a handful of users who want to set a config value, learn over the long-term that this is useful, and incrementally refactor the code to ease support for that config. I don't want to get into a spot where we have to do a large refactor to support this feature & do extensive sanity checking, only to determine that we only need 20% of the config values.

Nicolas Spiegelberg
added a comment - 06/Feb/12 15:40 @Lars: the original idea was to allow users to arbitrarily set KVs in the HTableDescriptor and HColumnDescriptor, but make it so users know that what they're doing is not checked. Need some sort of format to distinguish between reserved keywords and non-reserved (thinking of doing this on the client side). As a config value becomes more well-known, we can enforce limitations like you stated.
I'd rather have this evolve by having a handful of users who want to set a config value, learn over the long-term that this is useful, and incrementally refactor the code to ease support for that config. I don't want to get into a spot where we have to do a large refactor to support this feature & do extensive sanity checking, only to determine that we only need 20% of the config values.

Combined with online schema change, this will allow us to safely iterate on configuration settings.

So, configuration change would come in via online schema change rather than say, via zookeeper callback? The former would be heavyweight compared (closing and reopening regions which seems overkill for say, a change in flush size). Scoping to table and store also makes it so this scheme won't work for configurations that are not table nor store.

stack
added a comment - 08/Feb/12 01:37 Combined with online schema change, this will allow us to safely iterate on configuration settings.
So, configuration change would come in via online schema change rather than say, via zookeeper callback? The former would be heavyweight compared (closing and reopening regions which seems overkill for say, a change in flush size). Scoping to table and store also makes it so this scheme won't work for configurations that are not table nor store.

@stack: that's true, but you'd need to do a large refactor to handle truly online configuration transitions. we'd also need to add in locking primitives for all config params. this is a good long-term goal, but the strategy mentioned here would have almost no downtime and very little code change. Note that, with your region server draining, you can change global config values in an online fashion on a per-server basis by issuing a 'regionserver restart --draining' or something like that.

Nicolas Spiegelberg
added a comment - 08/Feb/12 03:41 @stack: that's true, but you'd need to do a large refactor to handle truly online configuration transitions. we'd also need to add in locking primitives for all config params. this is a good long-term goal, but the strategy mentioned here would have almost no downtime and very little code change. Note that, with your region server draining, you can change global config values in an online fashion on a per-server basis by issuing a 'regionserver restart --draining' or something like that.

@stack: have you done a lot of testing with killing ZKQuorumPeers on clusters with a lot of regions? We've good perf with a large number of ZK nodes during normal operation, but bad latency spikes when a peer restarts and the other peers have to update the restarted peer's state.

Nicolas Spiegelberg
added a comment - 08/Feb/12 03:43 @stack: have you done a lot of testing with killing ZKQuorumPeers on clusters with a lot of regions? We've good perf with a large number of ZK nodes during normal operation, but bad latency spikes when a peer restarts and the other peers have to update the restarted peer's state.

bq ...but you'd need to do a large refactor to handle truly online configuration transitions.

What if we just did the 'important' ones?

On the locking primitives, if a long or something, we could just have the config. volatile? Would be more costly than a final long but could make a local copy per method invocation I suppose.

Note that, with your region server draining, you can change global config values in an online fashion on a per-server basis by issuing a 'regionserver restart --draining' or something like that.

We can do that now, right? Make the change then do our rolling restart with graceful shedding of regions from the server, and then graceful replacement.

The addition of the HBaseTableConfiguration and HBaseStoreConfiguration would make it all a little nicer. I'm interested in how the HBaseStoreConfiguration configs make it into the schema and then get undone on the other side (You could ask me how a config. change in a zk callback makes it up into a regionserver Configuration instance... not sure)

...have you done a lot of testing with killing ZKQuorumPeers on clusters with a lot of regions

Haven't done any. Sounds like we should? Does the ensemble have to have a bunch of znodes afloat for you to see the spikes or is it just catching up the restarted peers state regardless of the particular znode loading at the time?

stack
added a comment - 08/Feb/12 04:33 bq ...but you'd need to do a large refactor to handle truly online configuration transitions.
What if we just did the 'important' ones?
On the locking primitives, if a long or something, we could just have the config. volatile? Would be more costly than a final long but could make a local copy per method invocation I suppose.
Note that, with your region server draining, you can change global config values in an online fashion on a per-server basis by issuing a 'regionserver restart --draining' or something like that.
We can do that now, right? Make the change then do our rolling restart with graceful shedding of regions from the server, and then graceful replacement.
The addition of the HBaseTableConfiguration and HBaseStoreConfiguration would make it all a little nicer. I'm interested in how the HBaseStoreConfiguration configs make it into the schema and then get undone on the other side (You could ask me how a config. change in a zk callback makes it up into a regionserver Configuration instance... not sure)
...have you done a lot of testing with killing ZKQuorumPeers on clusters with a lot of regions
Haven't done any. Sounds like we should? Does the ensemble have to have a bunch of znodes afloat for you to see the spikes or is it just catching up the restarted peers state regardless of the particular znode loading at the time?

@stack: I understand why some users might want this capability for 'important' toggles that they make frequently. I think that should be discussed as a separate JIRA. At FB (and probably other production companies), we have a chicken & egg problem where we don't know which seemingly-obscure values might be useful because we don't have an easy way to toggle them online (hence the JIRA). Most of our use cases, we'd alter the table maybe every month or two with a new, obscure config value that we want to change, so online schema change is acceptable since we incur minimal latency and no downtime.

Basically, our ZK use case was a UserID->ClusterID mapping for every FB user (large # of ZNodes). This had horrible scaling problems, even when we employed batching. I'm not a great point of contact for this, but I can get you in contact with one. As a rule of thumb, we've been cautious about adding too many ZNodes in the same way that you'd have an HBase admin be cautious about adding too many regions.

Nicolas Spiegelberg
added a comment - 08/Feb/12 05:02 @stack: I understand why some users might want this capability for 'important' toggles that they make frequently. I think that should be discussed as a separate JIRA. At FB (and probably other production companies), we have a chicken & egg problem where we don't know which seemingly-obscure values might be useful because we don't have an easy way to toggle them online (hence the JIRA). Most of our use cases, we'd alter the table maybe every month or two with a new, obscure config value that we want to change, so online schema change is acceptable since we incur minimal latency and no downtime.
Basically, our ZK use case was a UserID->ClusterID mapping for every FB user (large # of ZNodes). This had horrible scaling problems, even when we employed batching. I'm not a great point of contact for this, but I can get you in contact with one. As a rule of thumb, we've been cautious about adding too many ZNodes in the same way that you'd have an HBase admin be cautious about adding too many regions.

Any objections to changing the HTableDescriptor & HColumnDescriptor maps from ImmutableBytesWriteable to String? String itself is immutable and a subset of the other class. It looks like we always use the map to store Strings right now. This would require a version bump.

Nicolas Spiegelberg
added a comment - 22/Feb/12 22:34 Any objections to changing the HTableDescriptor & HColumnDescriptor maps from ImmutableBytesWriteable to String? String itself is immutable and a subset of the other class. It looks like we always use the map to store Strings right now. This would require a version bump.

nspiegelberg has commented on the revision "[jira]HBASE-5335 Dynamic Schema Config".

This is a first draft to begin the review process. Have manually verified that the code works, but no unit tests have been made yet. Also, need to figure out a way to delete a config value from the shell once added in.

Phabricator
added a comment - 10/Mar/12 01:58 nspiegelberg has commented on the revision " [jira] HBASE-5335 Dynamic Schema Config".
This is a first draft to begin the review process. Have manually verified that the code works, but no unit tests have been made yet. Also, need to figure out a way to delete a config value from the shell once added in.
REVISION DETAIL
https://reviews.facebook.net/D2247

tedyu has commented on the revision "[jira]HBASE-5335 Dynamic Schema Config".

INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:41 HColumnDescriptor is used by client package.
Would this introduce extra dependency ?
src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:106 wrap long line, please.
src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:531 Can this method be unified with HColumnDescriptor.getKVs() ?
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:452 regionInfo.getTableDesc() is an expensive method, in 0.92 and above.
Is there a way to avoid calling it ?
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:445 This ctor seems to be in 0.89-fb only.
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:1 Add license, please.
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:39 Add javadoc for the methods.
e.g. for getRaw():

Get the value of the <code>key</code> property, without doing

variable expansion.

src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:81 This can be removed, right ?
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:57 Minor:
If a return is added after this line, there is no need to introduce else statement, saving some indentation.
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:160 What if a higher-priority ImmutableConfigMap in this.configs returns null but a lower-priority ImmutableConfigMap returns non-null value ?
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:203 You mean bug in Configuration.java

Phabricator
added a comment - 10/Mar/12 17:06 tedyu has commented on the revision " [jira] HBASE-5335 Dynamic Schema Config".
INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:41 HColumnDescriptor is used by client package.
Would this introduce extra dependency ?
src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:106 wrap long line, please.
src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:531 Can this method be unified with HColumnDescriptor.getKVs() ?
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:452 regionInfo.getTableDesc() is an expensive method, in 0.92 and above.
Is there a way to avoid calling it ?
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:445 This ctor seems to be in 0.89-fb only.
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:1 Add license, please.
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:39 Add javadoc for the methods.
e.g. for getRaw():
Get the value of the <code>key</code> property, without doing
variable expansion.
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:81 This can be removed, right ?
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:57 Minor:
If a return is added after this line, there is no need to introduce else statement, saving some indentation.
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:160 What if a higher-priority ImmutableConfigMap in this.configs returns null but a lower-priority ImmutableConfigMap returns non-null value ?
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:203 You mean bug in Configuration.java
REVISION DETAIL
https://reviews.facebook.net/D2247

tedyu has commented on the revision "[jira]HBASE-5335 Dynamic Schema Config".

INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:160 For Configuration, the above shouldn't happen.
Disregard.
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:369 We may need this method later.
src/main/ruby/hbase/admin.rb:174 TRUNK version handles htd.setOwnerString() as well.
src/main/ruby/hbase/admin.rb:177 The error message doesn't seem to match condition.
src/main/ruby/hbase/admin.rb:346 A check similar to the one on line 188 is desirable here.
src/main/ruby/hbase/admin.rb:451 A check similar to the one on line 188 is desirable here.

Phabricator
added a comment - 10/Mar/12 17:26 tedyu has commented on the revision " [jira] HBASE-5335 Dynamic Schema Config".
INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:160 For Configuration, the above shouldn't happen.
Disregard.
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:369 We may need this method later.
src/main/ruby/hbase/admin.rb:174 TRUNK version handles htd.setOwnerString() as well.
src/main/ruby/hbase/admin.rb:177 The error message doesn't seem to match condition.
src/main/ruby/hbase/admin.rb:346 A check similar to the one on line 188 is desirable here.
src/main/ruby/hbase/admin.rb:451 A check similar to the one on line 188 is desirable here.
REVISION DETAIL
https://reviews.facebook.net/D2247

nspiegelberg has commented on the revision "[jira]HBASE-5335 Dynamic Schema Config".

INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:41 I think this would introduce a Guava dependency. I just wanted something with concise set notation for the rough draft. I'll try to use standard Java APIs for a future iteration since this is client-side.
src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:531 There is some special logic in HTD around ROOT & META regions that make unification nontrivial.
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:452 thanks for the info. I'll remember to look into 92 implementation details of this when I do a trunk port.
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:39 note that this is a private interface, so it wouldn't appear in javadoc. I should add some notes here though.
src/main/ruby/hbase/admin.rb:177 you only hit this error if 1 was specified but the other wasn't.

Phabricator
added a comment - 11/Mar/12 16:23 nspiegelberg has commented on the revision " [jira] HBASE-5335 Dynamic Schema Config".
INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:41 I think this would introduce a Guava dependency. I just wanted something with concise set notation for the rough draft. I'll try to use standard Java APIs for a future iteration since this is client-side.
src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:531 There is some special logic in HTD around ROOT & META regions that make unification nontrivial.
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:452 thanks for the info. I'll remember to look into 92 implementation details of this when I do a trunk port.
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:39 note that this is a private interface, so it wouldn't appear in javadoc. I should add some notes here though.
src/main/ruby/hbase/admin.rb:177 you only hit this error if 1 was specified but the other wasn't.
REVISION DETAIL
https://reviews.facebook.net/D2247

stack has commented on the revision "[jira]HBASE-5335 Dynamic Schema Config".

Looks good. Minor comments only.

INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:736 kvs is a loaded term in hbase code base. When I see it I think KeyValue, the class. Mayhaps change this method name?
src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:571 As Ted suggests, some of the body of this method could be broken out into a common method rather than dup code. For example, from here to the end of the loop seems common to both.

They both inherit WritableComparable. Could inherit a more specialized type, one w/ support for this and other commonage.

No biggie.
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:938 Why change name? You are returning the 'Configuration'. In general, why change the name to baseConf from conf when its just a Configuation still?
src/main/java/org/apache/hadoop/hbase/regionserver/SplitRequest.java:41 This is better I'd say.
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:48 lol

This a warning? There be dragons...
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:60 Isn't hadoop Configuration kinda wonky where there is the notion of finals and these are supposed to be at the 'front'. Does this 'front' go before the hadoop final 'front'? It looks like it does which is what we want I think.
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:113 In the rest of the code, we add curlies or else put if and the one line clause both on same line. FYI.
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:428 Why not do this stuff in HBaseConfiguration rather than add new method?

Phabricator
added a comment - 13/Mar/12 04:41 stack has commented on the revision " [jira] HBASE-5335 Dynamic Schema Config".
Looks good. Minor comments only.
INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:736 kvs is a loaded term in hbase code base. When I see it I think KeyValue, the class. Mayhaps change this method name?
src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:571 As Ted suggests, some of the body of this method could be broken out into a common method rather than dup code. For example, from here to the end of the loop seems common to both.
They both inherit WritableComparable. Could inherit a more specialized type, one w/ support for this and other commonage.
No biggie.
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:938 Why change name? You are returning the 'Configuration'. In general, why change the name to baseConf from conf when its just a Configuation still?
src/main/java/org/apache/hadoop/hbase/regionserver/SplitRequest.java:41 This is better I'd say.
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:48 lol
This a warning? There be dragons...
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:60 Isn't hadoop Configuration kinda wonky where there is the notion of finals and these are supposed to be at the 'front'. Does this 'front' go before the hadoop final 'front'? It looks like it does which is what we want I think.
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:113 In the rest of the code, we add curlies or else put if and the one line clause both on same line. FYI.
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:428 Why not do this stuff in HBaseConfiguration rather than add new method?
REVISION DETAIL
https://reviews.facebook.net/D2247

when you split, you take the 'conf' from the parent region and pass it into the daughter region's constructor. If you passed in the CompoundConfiguration, you would end up with using the HTD of the parent region and the new HTD of the daughter region. You really need to pass the base Configuration object used by HRegionServer to the daughter regions to avoid a tricky dedupe problem.

Phabricator
added a comment - 13/Mar/12 21:28 nspiegelberg has commented on the revision " [jira] HBASE-5335 Dynamic Schema Config".
INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:736 okay. maybe "getValues" since that's the variable name?
src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:571 yeah, I think that HTD & HCD needs a lot of unification work beyond this.
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:938 I'll write more comments, there's a nasty issue here:
when you split, you take the 'conf' from the parent region and pass it into the daughter region's constructor. If you passed in the CompoundConfiguration, you would end up with using the HTD of the parent region and the new HTD of the daughter region. You really need to pass the base Configuration object used by HRegionServer to the daughter regions to avoid a tricky dedupe problem.
REVISION DETAIL
https://reviews.facebook.net/D2247

tedyu has commented on the revision "[jira]HBASE-5335 Dynamic Schema Config".

INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:65 This seems to be the only place com.google.common.collect.Lists is used.
Can we instantiate ArrayList directly ?
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:120 This can be brought onto line 119.
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:143 Should getRaw() be called here ?
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:153 This can be removed.
src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:526 Can this be unified with HColumnDescriptor.getValues() ?

Phabricator
added a comment - 16/Mar/12 22:08 tedyu has commented on the revision " [jira] HBASE-5335 Dynamic Schema Config".
INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:65 This seems to be the only place com.google.common.collect.Lists is used.
Can we instantiate ArrayList directly ?
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:120 This can be brought onto line 119.
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:143 Should getRaw() be called here ?
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:153 This can be removed.
src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:526 Can this be unified with HColumnDescriptor.getValues() ?
REVISION DETAIL
https://reviews.facebook.net/D2247

INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:776 This looks nice in the output?
src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:582 Seems like some of this method could be unified w/ the HCD#getValues (as per Ted comment)
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:947 What is the dedupe problem? Dedup of HTD properties?
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:938 Does this have to be public? Only used inside this package?
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:1 Configuration in hadoop is in the conf package. Should we have one in hbase. We could put this there. Or, earlier I suggest it be an inner class of HBaseConfiguration or just a feature of HBC... you ignored that comment 'cos it silly?
src/main/ruby/hbase/admin.rb:187 How does this work in shell? I can't picture it looking at code. Looks like user says ADVANCED and then passes k/vs? If so, how you think we going to explain notion of ADVANCED?
src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java:169 Hmm... above I suggest this be a package private method but see here it needs to be public... at least for this test.

Phabricator
added a comment - 16/Mar/12 23:08 stack has commented on the revision " [jira] HBASE-5335 Dynamic Schema Config".
A few minors. Looks good. Will need fat release note when committed. Lars, might go into 0.94?
INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:776 This looks nice in the output?
src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:582 Seems like some of this method could be unified w/ the HCD#getValues (as per Ted comment)
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:947 What is the dedupe problem? Dedup of HTD properties?
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:938 Does this have to be public? Only used inside this package?
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:1 Configuration in hadoop is in the conf package. Should we have one in hbase. We could put this there. Or, earlier I suggest it be an inner class of HBaseConfiguration or just a feature of HBC... you ignored that comment 'cos it silly?
src/main/ruby/hbase/admin.rb:187 How does this work in shell? I can't picture it looking at code. Looks like user says ADVANCED and then passes k/vs? If so, how you think we going to explain notion of ADVANCED?
src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java:169 Hmm... above I suggest this be a package private method but see here it needs to be public... at least for this test.
REVISION DETAIL
https://reviews.facebook.net/D2247

We need to dedupe the HTD.
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:1 I guess the question is: do we want to refactor the existing API & both all conf classes under a conf directory? I put this under util because I didn't want to clutter the main folder with a class that was only supposed to be used internally. I considered putting it under the regionserver folder. Maybe that's a better fit?
src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java:169 yeah. another option is making a copy constructor
src/main/ruby/hbase/admin.rb:187 Basically, your schema would look like:

NAME => 'blah', BLOOMFILTER => ROWCOL,
ADVANCED =>

{"hbase.hstore.compaction.ratio" => "0.25"}

I don't want to explain the notion of "ADVANCED" too much beyond HBase committers. Basically, it's only a toggle for people who know what they're doing and aren't afraid to be power users and look at code. If we get really common config patterns, we should pull them out to reserved keywords for common users and then map. For example:

COMPACT_RATIO => 'hbase.hstore.compaction.ratio'

Why make a config option that most people won't play with? Because , as power users, we can iterate on functionality & help users. There is now a workaround for a specific user's problem without modifying code and we don't have to have advanced deprecation strategies like we would with a reserved keyword.

Phabricator
added a comment - 19/Mar/12 15:58 nspiegelberg has commented on the revision " [jira] HBASE-5335 Dynamic Schema Config".
INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:947 Important notes:
1. CompoundConfiguration is a derived class from Configuration
2. there is a CompoundConfiguration.add(Configuration) function
HRegion.conf = CompoundConfiguration(BaseConf, HTD)
if you passed HRegion.conf to the daughter region constructor on a split, the daughter region would have:
HRegion.conf
= CompoundConfiguration(CompoundConfiguration(BaseConf, HTD), HTD)
= CompoundConfiguration(BaseConf, HTD, HTD)
We need to dedupe the HTD.
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:1 I guess the question is: do we want to refactor the existing API & both all conf classes under a conf directory? I put this under util because I didn't want to clutter the main folder with a class that was only supposed to be used internally. I considered putting it under the regionserver folder. Maybe that's a better fit?
src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java:169 yeah. another option is making a copy constructor
src/main/ruby/hbase/admin.rb:187 Basically, your schema would look like:
NAME => 'blah', BLOOMFILTER => ROWCOL,
ADVANCED =>
{"hbase.hstore.compaction.ratio" => "0.25"}
I don't want to explain the notion of "ADVANCED" too much beyond HBase committers. Basically, it's only a toggle for people who know what they're doing and aren't afraid to be power users and look at code. If we get really common config patterns, we should pull them out to reserved keywords for common users and then map. For example:
COMPACT_RATIO => 'hbase.hstore.compaction.ratio'
Why make a config option that most people won't play with? Because , as power users, we can iterate on functionality & help users. There is now a workaround for a specific user's problem without modifying code and we don't have to have advanced deprecation strategies like we would with a reserved keyword.
REVISION DETAIL
https://reviews.facebook.net/D2247

Note: alterations to the CF schema can be made online. Currently, alterations to the table-level schema requires a disable-modify-enable. This is related to online schema design but should be trivial to alter.

Nicolas Spiegelberg
added a comment - 19/Mar/12 20:51 Note: alterations to the CF schema can be made online. Currently, alterations to the table-level schema requires a disable-modify-enable. This is related to online schema design but should be trivial to alter.

mbautin has commented on the revision "[jira]HBASE-5335 Dynamic Schema Config".

A few initial comments (unfortunately on an earlier version of the revision).

INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:748 Use containsKey
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:753 Are we trying to make the output jruby-parseable? If so, we need to take care of escaping embedded single quotes here.
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:760 HConstants.ADVANCED sounds confusing. I think we need that constant to have a more descriptive name.
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:773-775 See my earlier comment about escaping embedded single quotes. The escaping method also needs to be shared between all callsites.
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:783 Is it possible to create the unmodifiable map once instead of every time this is called?
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:41 I might have missed some earlier discussion, but why exactly is a Guava dependency a bad thing? Guava is licensed under Apache License 2.0 according to http://code.google.com/p/guava-libraries/.

Phabricator
added a comment - 19/Mar/12 21:17 mbautin has commented on the revision " [jira] HBASE-5335 Dynamic Schema Config".
A few initial comments (unfortunately on an earlier version of the revision).
INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:748 Use containsKey
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:753 Are we trying to make the output jruby-parseable? If so, we need to take care of escaping embedded single quotes here.
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:760 HConstants.ADVANCED sounds confusing. I think we need that constant to have a more descriptive name.
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:773-775 See my earlier comment about escaping embedded single quotes. The escaping method also needs to be shared between all callsites.
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:783 Is it possible to create the unmodifiable map once instead of every time this is called?
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:41 I might have missed some earlier discussion, but why exactly is a Guava dependency a bad thing? Guava is licensed under Apache License 2.0 according to http://code.google.com/p/guava-libraries/ .
REVISION DETAIL
https://reviews.facebook.net/D2247

nspiegelberg has commented on the revision "[jira]HBASE-5335 Dynamic Schema Config".

INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:760 actually, the purpose was to dissuade people from using this unless they know what they're doing. we don't want people randomly putting keys in here without looking at the source code and then wondering why it doesn't work. make sense? do you have another suggestion?
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:753 we don't currently have any config variables with single quotes, do we? it's much more developer-controlled than user input. I guess I should do some basic sanitization for the user error case. the important ability is that the user can mistakenly enter a key with a single quote and then delete it
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:783 this is a shallow pointer copy, not a deep KV copy
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:41 @mbautin: ted's point is that this will be in client code, so every app server & MapReduce cluster would need to have the Guava dependency installed versus just the HBase server deployment.

Phabricator
added a comment - 19/Mar/12 21:49 nspiegelberg has commented on the revision " [jira] HBASE-5335 Dynamic Schema Config".
INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:760 actually, the purpose was to dissuade people from using this unless they know what they're doing. we don't want people randomly putting keys in here without looking at the source code and then wondering why it doesn't work. make sense? do you have another suggestion?
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:753 we don't currently have any config variables with single quotes, do we? it's much more developer-controlled than user input. I guess I should do some basic sanitization for the user error case. the important ability is that the user can mistakenly enter a key with a single quote and then delete it
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:783 this is a shallow pointer copy, not a deep KV copy
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:41 @mbautin: ted's point is that this will be in client code, so every app server & MapReduce cluster would need to have the Guava dependency installed versus just the HBase server deployment.
REVISION DETAIL
https://reviews.facebook.net/D2247

mbautin has commented on the revision "[jira]HBASE-5335 Dynamic Schema Config".

Nicolas: Looks good! Some more comments inline.

Also, we now have lint enabled. Could you please re-run "mvn -Darc initialize", then do "arc lint" or "arc diff --preview" and address lint comments? Thanks!

INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/HConstants.java:361 This constant needs a better name. Probably some of the constants above it do, too.
src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:530 Is it possible to reduce code duplication between here and HColumnDescriptor?
src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:549 single quote escaping
src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:579-581 single quote escaping
src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:367 Looks similar to the corresponding function in HColumnDescriptor. Is it possible to reuse code between the two? (getValues() would be a bigger case for that.)
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:42 Add <p/>
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:222-229 Why exactly do we have to copy-and-paste the code instead of using composition or inheritance?

If there is a bug in Configuration and it is fixed there, it will not be reflected here, which is indeed somewhat "tragic".
src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4246 It would be great to avoid adding more tests to TestFromClientSide and create a separate test class instead.
src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4247 This probably should not use javadoc syntax.
src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4312 Is it possible to wait for an event instead of a specific amount of time?
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:753 In that case we should probably sanitize user input for single quotes somewhere else, wherever the user is supposed to tweak configuration values in real time. However, I think it is easier to escape single quotes here. It would also be good to parse the output value of this function with jruby in a test.
src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4322 Create an HConstant for this key
src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4292 Sleep times should be according to http://hbase.apache.org/book.html#hbase.tests.writing . We should probably have a constant for each type of sleep time. Quoting the linked page from HBase book:

> Whenever possible, tests should not use Thread.sleep, but rather waiting for the real event they need. This is faster and clearer for the reader. Tests should not do a Thread.sleep without testing an ending condition. This allows understanding what the test is waiting for. Moreover, the test will work whatever the machine performance is. Sleep should be minimal to be as fast as possible. Waiting for a variable should be done in a 40ms sleep loop. Waiting for a socket operation should be done in a 200 ms sleep loop.
src/test/java/org/apache/hadoop/hbase/util/TestCompoundConfiguration.java:33 Can this be private?
src/test/java/org/apache/hadoop/hbase/util/TestCompoundConfiguration.java:43-46 Is this override necessary?
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:760 I guess we need a javadoc saying that ADVANCED is not for general-purpose use, and that it is meant for use as an HColumnDescriptor key, then.
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:783 OK, if you think we won't be calling this a lot, this is fine with me.

Phabricator
added a comment - 19/Mar/12 22:01 mbautin has commented on the revision " [jira] HBASE-5335 Dynamic Schema Config".
Nicolas: Looks good! Some more comments inline.
Also, we now have lint enabled. Could you please re-run "mvn -Darc initialize", then do "arc lint" or "arc diff --preview" and address lint comments? Thanks!
INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/HConstants.java:361 This constant needs a better name. Probably some of the constants above it do, too.
src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:530 Is it possible to reduce code duplication between here and HColumnDescriptor?
src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:549 single quote escaping
src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:579-581 single quote escaping
src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:367 Looks similar to the corresponding function in HColumnDescriptor. Is it possible to reuse code between the two? (getValues() would be a bigger case for that.)
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:42 Add <p/>
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:222-229 Why exactly do we have to copy-and-paste the code instead of using composition or inheritance?
If there is a bug in Configuration and it is fixed there, it will not be reflected here, which is indeed somewhat "tragic".
src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4246 It would be great to avoid adding more tests to TestFromClientSide and create a separate test class instead.
src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4247 This probably should not use javadoc syntax.
src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4312 Is it possible to wait for an event instead of a specific amount of time?
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:753 In that case we should probably sanitize user input for single quotes somewhere else, wherever the user is supposed to tweak configuration values in real time. However, I think it is easier to escape single quotes here. It would also be good to parse the output value of this function with jruby in a test.
src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4322 Create an HConstant for this key
src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4292 Sleep times should be according to http://hbase.apache.org/book.html#hbase.tests.writing . We should probably have a constant for each type of sleep time. Quoting the linked page from HBase book:
> Whenever possible, tests should not use Thread.sleep, but rather waiting for the real event they need. This is faster and clearer for the reader. Tests should not do a Thread.sleep without testing an ending condition. This allows understanding what the test is waiting for. Moreover, the test will work whatever the machine performance is. Sleep should be minimal to be as fast as possible. Waiting for a variable should be done in a 40ms sleep loop. Waiting for a socket operation should be done in a 200 ms sleep loop.
src/test/java/org/apache/hadoop/hbase/util/TestCompoundConfiguration.java:33 Can this be private?
src/test/java/org/apache/hadoop/hbase/util/TestCompoundConfiguration.java:43-46 Is this override necessary?
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:760 I guess we need a javadoc saying that ADVANCED is not for general-purpose use, and that it is meant for use as an HColumnDescriptor key, then.
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:783 OK, if you think we won't be calling this a lot, this is fine with me.
REVISION DETAIL
https://reviews.facebook.net/D2247

nspiegelberg has commented on the revision "[jira]HBASE-5335 Dynamic Schema Config".

INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:753 I looked into this some more. We would need to escape more than just single quotes. There is a function StringEscapeUtils.escapeJavaScript() in apache.commons.lang that uses the same parsing escapes as Ruby. However, that requires us to either ensure client packaging of this dependency or write a custom parser for this.

practically, single quotes are a mistake. we offer zero parsing sanitization as is and just let the RegionServer throw an IllegalArgumentException if the user configures it wrong. we need to let the user undo a single quote mistake. this is currently possible. value sanitization is the same as existing functionality
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:760 currently, there is no javadoc info & the user has no way of knowing he can do this unless he knows about it from the JIRA. security through obscurity.
src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:367 Already talked with Ted & Stack about this in previous comments. A summary:

Both HTD & HCD have massive redundancies that could be unified beyond this function. They are both basically decorated K,V stores. I don't think unification is very necessary until we have something like locality groups and need a third K,V store. Have you traditionally seen bugs related to HTD & HCD inconsistencies? I'm just worried that there is time spent on thinking of how to refactor this code without thinking about whether the denormalized design is actually hurting us on a practical feature design/debug sense.
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:222-229 3 problems:

1) If I don't use Configuration.java, then I need to refactor a LOT of the regionserver code to use the new API

2) we basically need to override Configuration.getProps or we'd need protected access to Configuration.properties so we can modify that pointer. Basically, there are a bunch of functions in the base Configuration that call getProps() and we need to use our derived version instead of the base version.

3) Trying to make a generic implementation that works across all HDFS versions. I would like to modify Configuration.properties in HDFS 1.0 to be protected, but I'd still need to have this code until my patch made it to all the versions we support.

Configuration.java hasn't changed much, so I don't think this is an issue in practice. Do you have another strategy?
src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4292 The 10 sec sleep is found from TestFromClientSide.compactCFRegion. I used the same paradigm here and added poll-waiting to speed up the test. We need the synchronous compaction feature before we can remove this (HBASE-2949).

Phabricator
added a comment - 19/Mar/12 22:33 nspiegelberg has commented on the revision " [jira] HBASE-5335 Dynamic Schema Config".
INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:753 I looked into this some more. We would need to escape more than just single quotes. There is a function StringEscapeUtils.escapeJavaScript() in apache.commons.lang that uses the same parsing escapes as Ruby. However, that requires us to either ensure client packaging of this dependency or write a custom parser for this.
practically, single quotes are a mistake. we offer zero parsing sanitization as is and just let the RegionServer throw an IllegalArgumentException if the user configures it wrong. we need to let the user undo a single quote mistake. this is currently possible. value sanitization is the same as existing functionality
src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:760 currently, there is no javadoc info & the user has no way of knowing he can do this unless he knows about it from the JIRA. security through obscurity.
src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:367 Already talked with Ted & Stack about this in previous comments. A summary:
Both HTD & HCD have massive redundancies that could be unified beyond this function. They are both basically decorated K,V stores. I don't think unification is very necessary until we have something like locality groups and need a third K,V store. Have you traditionally seen bugs related to HTD & HCD inconsistencies? I'm just worried that there is time spent on thinking of how to refactor this code without thinking about whether the denormalized design is actually hurting us on a practical feature design/debug sense.
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:222-229 3 problems:
1) If I don't use Configuration.java, then I need to refactor a LOT of the regionserver code to use the new API
2) we basically need to override Configuration.getProps or we'd need protected access to Configuration.properties so we can modify that pointer. Basically, there are a bunch of functions in the base Configuration that call getProps() and we need to use our derived version instead of the base version.
3) Trying to make a generic implementation that works across all HDFS versions. I would like to modify Configuration.properties in HDFS 1.0 to be protected, but I'd still need to have this code until my patch made it to all the versions we support.
Configuration.java hasn't changed much, so I don't think this is an issue in practice. Do you have another strategy?
src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4292 The 10 sec sleep is found from TestFromClientSide.compactCFRegion. I used the same paradigm here and added poll-waiting to speed up the test. We need the synchronous compaction feature before we can remove this ( HBASE-2949 ).
REVISION DETAIL
https://reviews.facebook.net/D2247

mbautin has commented on the revision "[jira]HBASE-5335 Dynamic Schema Config".

INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:222-229 OK, got it. Please note in the comment that these methods are private in some HDFS versions HBase has to support.
src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:367 OK, sounds good.

nspiegelberg has commented on the revision "[jira]HBASE-5335 Dynamic Schema Config".

INLINE COMMENTS
src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4246 we need to chop this file into multiple pieces for parallelization, separate from this JIRA, correct? testFromClientSide seems to be the correct spot since it's an HBase java client integration test.
src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4326 will make an HConstant

mbautin has commented on the revision "[jira]HBASE-5335 Dynamic Schema Config".

A couple more responses inline.

INLINE COMMENTS
src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4246 This would be the right place for a new test like this if it were not as huge as it is now. Would it be difficult to add new unit tests into separate test classes? I have tried to do so when creating new tests and have not had much difficulties.
src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4292 It would be great to add this explanation to the comment and/or make this delay a meaningfully-named constant and reuse it in both places.

Phabricator
added a comment - 19/Mar/12 22:55 mbautin has commented on the revision " [jira] HBASE-5335 Dynamic Schema Config".
A couple more responses inline.
INLINE COMMENTS
src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4246 This would be the right place for a new test like this if it were not as huge as it is now. Would it be difficult to add new unit tests into separate test classes? I have tried to do so when creating new tests and have not had much difficulties.
src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4292 It would be great to add this explanation to the comment and/or make this delay a meaningfully-named constant and reuse it in both places.
REVISION DETAIL
https://reviews.facebook.net/D2247

Phabricator
added a comment - 20/Mar/12 21:21 mbautin has commented on the revision " [jira] HBASE-5335 Dynamic Schema Config".
Nicolas: could you please re-submit this with lint results? You need to re-run "mvn -Darc initialize", and then you can use "arc diff --preview" instead of "arc diff --only" to include lint results.
REVISION DETAIL
https://reviews.facebook.net/D2247

1) coprocessors already use the generic KV api to store their info in HTableDescriptor. would be nice for someone with coprocessor experience to view it.
2) Because of #1, may want to consider making a separate config map for this JIRA. this would require changing the persistent data format though We could use the keyword 'CONFIG' instead of 'ADVANCED'?
3) There is currently a bug in the trunk version of online schema change where HBaseAdmin.getAlterStatus() is not synchronous. This is causing the new TestFromClientSide3 to fail. Problem does not exist in 89fb

Nicolas Spiegelberg
added a comment - 22/Mar/12 14:38 Working on a port for trunk. 3 items:
1) coprocessors already use the generic KV api to store their info in HTableDescriptor. would be nice for someone with coprocessor experience to view it.
2) Because of #1, may want to consider making a separate config map for this JIRA. this would require changing the persistent data format though We could use the keyword 'CONFIG' instead of 'ADVANCED'?
3) There is currently a bug in the trunk version of online schema change where HBaseAdmin.getAlterStatus() is not synchronous. This is causing the new TestFromClientSide3 to fail. Problem does not exist in 89fb

Initial patch for trunk. I think I should change 'ADVANCED' to config and a couple other minor things, so please don't commit. This is for hadoop QA plus for you to fiddle with. Note that TestFromClientSide3 should fail until HBASE-5359 is committed. I made a preliminary patch for HBASE-5359 and verified that it fixed the test.

Nicolas Spiegelberg
added a comment - 27/Mar/12 03:01 Initial patch for trunk. I think I should change 'ADVANCED' to config and a couple other minor things, so please don't commit. This is for hadoop QA plus for you to fiddle with. Note that TestFromClientSide3 should fail until HBASE-5359 is committed. I made a preliminary patch for HBASE-5359 and verified that it fixed the test.

Hadoop QA
added a comment - 27/Mar/12 04:44 -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12520049/HBASE-5335-trunk.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 20 new or modified tests.
-1 patch. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1317//console
This message is automatically generated.

stack
added a comment - 02/Apr/12 23:46 Can you make a patch for trunk Nicolas. Lets get this in.....
Should you doc this new behavior:
+ if (value == null ) {
+ remove(Bytes.toBytes(key));
Should this be hasReservedKeywords rather than hasAdvancedKeys?
+ if (!RESERVED_KEYWORDS.contains(k)) {
+ hasAdvancedKeys = true ;
Yeah, whats ADVANCED vs RESERVER_KEYWORDS?

1) Use the keyword CONFIG instead of ADVANCED. This should be a little more straightforward to understand.
2) Move CompoundConfiguration into the regionserver namespace and make it a package private class with a private interface audience. This will prevent other people from using this transitory class and avoid documentation.

This patch will fail on TestFromClientSide3 until HBASE-5359 is committed.

Nicolas Spiegelberg
added a comment - 04/Apr/12 22:05 Version 2 of the trunk patch. Major changes:
1) Use the keyword CONFIG instead of ADVANCED. This should be a little more straightforward to understand.
2) Move CompoundConfiguration into the regionserver namespace and make it a package private class with a private interface audience. This will prevent other people from using this transitory class and avoid documentation.
This patch will fail on TestFromClientSide3 until HBASE-5359 is committed.

Hadoop QA
added a comment - 04/Apr/12 23:50 -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12521404/D2247.8.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 7 new or modified tests.
-1 patch. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1396//console
This message is automatically generated.

Status update
1. Fixing the 3 unit test failures, caused by new split code between 89fb -> trunk.
2. Fixed some ruby display/parsing problems found during manual testing. We need a way to unit test the Ruby shell code easily.
3. Taking the time to fix an issue that constantly annoys me: you should be able to run 'describe TABLE', copy the output, and paste it directly into 'create' or 'alter'

Nicolas Spiegelberg
added a comment - 06/Apr/12 20:27 Status update
1. Fixing the 3 unit test failures, caused by new split code between 89fb -> trunk.
2. Fixed some ruby display/parsing problems found during manual testing. We need a way to unit test the Ruby shell code easily.
3. Taking the time to fix an issue that constantly annoys me: you should be able to run 'describe TABLE', copy the output, and paste it directly into 'create' or 'alter'

Hadoop QA
added a comment - 06/Apr/12 20:52 -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12521717/HBASE-5335-trunk-3.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 20 new or modified tests.
-1 patch. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1435//console
This message is automatically generated.

Hadoop QA
added a comment - 06/Apr/12 21:07 -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12521724/HBASE-5335-trunk-3.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 20 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1436//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1436//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1436//console
This message is automatically generated.

Hadoop QA
added a comment - 07/Apr/12 00:31 +1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12521768/HBASE-5335-trunk-4.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 20 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1443//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1443//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1443//console
This message is automatically generated.

stack
added a comment - 08/Apr/12 00:59 Did a quick scan. +1 on commit (Its hard to look at this patch w/ fresh eyes. I've looked at it a good few times already). Tests look great. Good stuff Nicolas. Lars, you want this for 0.94?