Details

Description

The work on branch-20-append was to support sync, for durable HBase WALs, not append. The branch-20-append implementation is known to be buggy. There's been confusion about this, we often answer queries on the list like this. Unfortunately, the way to enable correct sync on branch-1 for HBase is to set dfs.support.append to true in your config, which has the side effect of enabling append (which we don't want to do).

For v1.x let's:

Always enable the sync path (currently only enabled if dfs.support.append is set)

Remove the dfs.support.append configuration option. Let's keep the code paths though in case we ever fix append on branch-1, in which case we can add the config option back

For 2.x let's

Always enable the hsync/hflush path

The dfs.support.appends only enables the append specific paths (since the hsync/hflush paths are now always on). Append will still default to being enabled so there is no net effect by default.

There is a little problem that sync in 1.x and hflush in 2.x are different. They different methods and slightly different semantic. Do we really need this new option? How about change dfs.support.append for only append. Sync/hflush is always enabled.

Tsz Wo Nicholas Sze
added a comment - 29/Mar/12 17:38 > Let's add a new dfs.support.hsync option ...
There is a little problem that sync in 1.x and hflush in 2.x are different. They different methods and slightly different semantic. Do we really need this new option? How about change dfs.support.append for only append. Sync/hflush is always enabled.

Make hsync/hflush behavior independent of whether dfs.support.appends is enabled, ie you can turn append off and hsync/hflush still work

Note this does not add a dfs.support.sync option to 2.x This would be useful to people who want to disable both append and sync, eg to make a couple paths cheaper. I think we should not add this option unless it's requested / needed in the future. Note that this code path is already enabled by default, so the change is just that people can currently disable all sync/append code paths by disabling append, now they would just be disabling the append-specific code paths.

Eli Collins
added a comment - 29/Mar/12 17:45 For 1.x:
Add a dfs.support.sync option and enable it by default
For 2.x:
Make hsync/hflush behavior independent of whether dfs.support.appends is enabled, ie you can turn append off and hsync/hflush still work
Note this does not add a dfs.support.sync option to 2.x This would be useful to people who want to disable both append and sync, eg to make a couple paths cheaper. I think we should not add this option unless it's requested / needed in the future. Note that this code path is already enabled by default, so the change is just that people can currently disable all sync/append code paths by disabling append, now they would just be disabling the append-specific code paths.

Forgot to mention, given that we know there are data loss issues with append, why don't we remove the ability to enable it at the same time? It's an incompatible change, however there's also no reason in letting users shoot themselves in the foot.

Eli Collins
added a comment - 29/Mar/12 18:13 Forgot to mention, given that we know there are data loss issues with append, why don't we remove the ability to enable it at the same time? It's an incompatible change, however there's also no reason in letting users shoot themselves in the foot.

Eli Collins
added a comment - 29/Mar/12 23:03 Latest proposal:
For 1.x:
Always enable the sync path (currently only enabled if dfs.support.append is set)
Remove the dfs.support.append configuration option. Let's keep the code paths though in case we ever fix append on branch-1, in which case we can add the config option back
For 2.x:
Always enable the hsync/hflush path
The dfs.support.appends only enables the append specific paths (since the hsync/hflush paths are now always on). Append will still default to being enabled so there is no net effect by default
Sound good?

Eli Collins
added a comment - 30/Mar/12 00:47 Thanks guys. I filed a separate jira ( HADOOP-8230 ) for the 1.x change so I can clearly separate it from the 2.x change, and eg indicate that it is an incompatible change.
I've filed HBASE-5676 to share the good news.

Hadoop QA
added a comment - 30/Mar/12 18:21 -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12520627/hdfs-3120.txt
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 51 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hdfs.TestFileAppend4
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2127//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2127//console
This message is automatically generated.

Yea, it passes for me locally, given that the code paths don't change - hscync/flush and append are already enabled by default - shouldn't change the execution of this test. I'll re-run jenkins for sanity. Attached patch had to rebase on trunk since TestDatanodeRestart was moved.

Eli Collins
added a comment - 02/Apr/12 23:04 Yea, it passes for me locally, given that the code paths don't change - hscync/flush and append are already enabled by default - shouldn't change the execution of this test. I'll re-run jenkins for sanity. Attached patch had to rebase on trunk since TestDatanodeRestart was moved.

Hadoop QA
added a comment - 02/Apr/12 23:44 -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12521073/hdfs-3120.txt
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 51 new or modified tests.
-1 patch. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2165//console
This message is automatically generated.

Hadoop QA
added a comment - 03/Apr/12 00:14 -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12521073/hdfs-3120.txt
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 51 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hdfs.server.namenode.TestValidateConfigurationSettings
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2164//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2164//console
This message is automatically generated.