Details

Append is not supported in Hadoop 1.x. Please upgrade to 2.x if you need append. If you enabled dfs.support.append for HBase, you're OK, as durable sync (why HBase required dfs.support.append) is now enabled by default. If you really need the previous functionality, to turn on the append functionality set the flag "dfs.support.broken.append" to true.

Append is not supported in Hadoop 1.x. Please upgrade to 2.x if you need append. If you enabled dfs.support.append for HBase, you're OK, as durable sync (why HBase required dfs.support.append) is now enabled by default. If you really need the previous functionality, to turn on the append functionality set the flag "dfs.support.broken.append" to true.

Suresh Srinivas
added a comment - 06/Jul/12 19:15 I had marked HADOOP-8365 as a blocker for 1.1.0.
Since HADOOP-8365 has not been fixed yet for 1.1.0, I am -1 on this patch. If HADOOP-8365 gets fixed, I will remove my -1.

Since you're worried that the sync path being buggy I think the best remedy is testing. The sync path in 1.0.x has been tested extensively by HBase users (including MR jobs), but we need to spend time looking at any bugs/changed behavior that might crop up as part of testing the Hadoop 1.x release anyway.

This change prevents known data loss on out of the box Hadoop 1.x installs, that seems more important than issues that could potentially come up during testing. I think our HBase users feel similarly. The reason I filed a separate jira for adding the flag is that I don't think that we should let users enable a code path that we know can result in data loss, and means we have to test two code paths. I also see your POV which is why I'm not -1 on this flag even though I don't like it.

Eli Collins
added a comment - 08/May/12 01:51 This change was proposed and discussed over a month ago:
https://issues.apache.org/jira/browse/HDFS-3120?focusedCommentId=13241903&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13241903
Since you're worried that the sync path being buggy I think the best remedy is testing. The sync path in 1.0.x has been tested extensively by HBase users (including MR jobs), but we need to spend time looking at any bugs/changed behavior that might crop up as part of testing the Hadoop 1.x release anyway.
This change prevents known data loss on out of the box Hadoop 1.x installs, that seems more important than issues that could potentially come up during testing. I think our HBase users feel similarly. The reason I filed a separate jira for adding the flag is that I don't think that we should let users enable a code path that we know can result in data loss, and means we have to test two code paths. I also see your POV which is why I'm not -1 on this flag even though I don't like it.

There are several clusters that I support that do not use sync, that currently runs with append turned off.

I'd think the number few and if there any still conscious this option even exists, they are probably suffering from the FUD that sync is buggy/broke. We should help them get over their misconception?

I agree that the code that is being enabled has been stable for some time, which is the main reason why it was ported to 0.20.205. However I would like to retain the existing behavior and not enable a change unnecessarily on these clusters. This avoids having to worry about or spend time looking at any bugs/changed behavior that might crop up.

For these kinds of changes (see several token related changes that happened in 1.x), I have always advocated adding a flag so existing deployments can stay unaffected. I am asking the same here. It is more important given this patch removed an option that existed to turn off new code.

if you feel strongly that we should have a config option that let's people keep the previous/broken sync behavior go for it

The need for an option is a comment on the patch committed in this jira. Sorry I could not comment quickly enough, as this patch was committed with a short turn around time. I think it should be addressed as a subsequent patch for this jira and not a separate optional item. Alternatively we could revert this change and rework it to add a flag.

Suresh Srinivas
added a comment - 07/May/12 20:24 Do you think there many users who'd want to do this Suresh?
There are several clusters that I support that do not use sync, that currently runs with append turned off.
I'd think the number few and if there any still conscious this option even exists, they are probably suffering from the FUD that sync is buggy/broke. We should help them get over their misconception?
I agree that the code that is being enabled has been stable for some time, which is the main reason why it was ported to 0.20.205. However I would like to retain the existing behavior and not enable a change unnecessarily on these clusters. This avoids having to worry about or spend time looking at any bugs/changed behavior that might crop up.
For these kinds of changes (see several token related changes that happened in 1.x), I have always advocated adding a flag so existing deployments can stay unaffected. I am asking the same here. It is more important given this patch removed an option that existed to turn off new code.
if you feel strongly that we should have a config option that let's people keep the previous/broken sync behavior go for it
The need for an option is a comment on the patch committed in this jira. Sorry I could not comment quickly enough, as this patch was committed with a short turn around time. I think it should be addressed as a subsequent patch for this jira and not a separate optional item. Alternatively we could revert this change and rework it to add a flag.

That is the behavior I want a user to be able to turn off with a flag.

Do you think there many users who'd want to do this Suresh? I'd think the number few and if there any still conscious this option even exists, they are probably suffering from the FUD that sync is buggy/broke. We should help them get over their misconception? (Pardon me if I am way off on this. Just offering an opinion from outer-left-field)

stack
added a comment - 07/May/12 20:17 That is the behavior I want a user to be able to turn off with a flag.
Do you think there many users who'd want to do this Suresh? I'd think the number few and if there any still conscious this option even exists, they are probably suffering from the FUD that sync is buggy/broke. We should help them get over their misconception? (Pardon me if I am way off on this. Just offering an opinion from outer-left-field)

I get that, what I'm saying is that I don't see the rationale for disabling the durable sync code paths. I'm -0 on HADOOP-8365, if you feel strongly that we should have a config option that let's people keep the previous/broken sync behavior go for it. I just don't see it.

Eli Collins
added a comment - 07/May/12 19:49 I get that, what I'm saying is that I don't see the rationale for disabling the durable sync code paths. I'm -0 on HADOOP-8365 , if you feel strongly that we should have a config option that let's people keep the previous/broken sync behavior go for it. I just don't see it.

There may be a misunderstanding: the dfs.support.append flag never controlled whether sync was enabled.

dfs.support.append turned off some code paths. These code paths are not just related to append. They enable durable sync. See the patch where it changes, "if support append then do x else do y" to do "x" without any check. That is the behavior I want a user to be able to turn off with a flag.

Suresh Srinivas
added a comment - 07/May/12 19:38 There may be a misunderstanding: the dfs.support.append flag never controlled whether sync was enabled.
dfs.support.append turned off some code paths. These code paths are not just related to append. They enable durable sync. See the patch where it changes, "if support append then do x else do y" to do "x" without any check. That is the behavior I want a user to be able to turn off with a flag.

@Suresh, I understand that someone said they want to run their installation w/ broken sync the question is why they want to run an installation with broken sync. I filed HADOOP-8365 for such a flag here, but there's currently no rationale for why you'd want to do that. Also, someone choosing to upgrade from 1.0.x to 1.1 is going to pick up new changes - and even new features - and most of them don't have a flag to disable them. Nothing new there.

@Koji, per your comment..

Before moving all of our non-HBase clusters to 2.0, we might use 1.1 for some time. During this period, I do not want some production projects to start relying on the sync features then find some regression/difference on 2.0 blocking our upgrade schedule.

There may be a misunderstanding: the dfs.support.append flag never controlled whether sync was enabled. DFSClient#sync and NN#fsync have always been available - enabling sync by default does not expose any new APIs to your users that were not previously available. The difference is that this fixes bugs for your users that were already using sync, which I think you'll want.

Eli Collins
added a comment - 07/May/12 19:24 @Suresh, I understand that someone said they want to run their installation w/ broken sync the question is why they want to run an installation with broken sync. I filed HADOOP-8365 for such a flag here, but there's currently no rationale for why you'd want to do that. Also, someone choosing to upgrade from 1.0.x to 1.1 is going to pick up new changes - and even new features - and most of them don't have a flag to disable them. Nothing new there.
@Koji, per your comment..
Before moving all of our non-HBase clusters to 2.0, we might use 1.1 for some time. During this period, I do not want some production projects to start relying on the sync features then find some regression/difference on 2.0 blocking our upgrade schedule.
There may be a misunderstanding: the dfs.support.append flag never controlled whether sync was enabled. DFSClient#sync and NN#fsync have always been available - enabling sync by default does not expose any new APIs to your users that were not previously available . The difference is that this fixes bugs for your users that were already using sync, which I think you'll want.

Suresh Srinivas
added a comment - 07/May/12 18:16 what is that use case?
I think I have explained it in the comments above. To repeat:
"Still, I think we should retain ability to turn it off, because I want to continue running my installation that way and this patch removes that ability."

stack
added a comment - 07/May/12 16:22 This was turned on in the 20 originally and then we had to turn it off due bugs.
The "bugs" were fixed a good while ago.
Can you please explain the reason the make this change?
+ HBase needs it.
+ Its broken that users have to flip a configuration flag, then stop+start, to make a basic fs api method work.

I'm probably a minority here but I do want the "an option to disable durable sync" not because it could be buggy but because it could be too stable compared to 0.23/2.0. Before moving all of our non-HBase clusters to 2.0, we might use 1.1 for some time. During this period, I do not want some production projects to start relying on the sync features then find some regression/difference on 2.0 blocking our upgrade schedule.

Koji Noguchi
added a comment - 07/May/12 16:04 , but what is that use case?
I'm probably a minority here but I do want the "an option to disable durable sync" not because it could be buggy but because it could be too stable compared to 0.23/2.0. Before moving all of our non-HBase clusters to 2.0, we might use 1.1 for some time. During this period, I do not want some production projects to start relying on the sync features then find some regression/difference on 2.0 blocking our upgrade schedule.

We shouldn't provide a flag that enables append because we know append has data loss issues

HBase and other programs have data loss when running against a default Hadoop 1.x install

The rationale is pretty clear - this prevents data loss.

Per my earlier comment, I'm open to having an option to disable durable sync if you think that use case is important, but what is that use case? Given that the sync code path is well tested and debugged, why would you want to run with a buggy sync implementation?

Eli Collins
added a comment - 07/May/12 05:32 Suresh, Sanjay, the rationale is discussed in HDFS-3120 . In short:
We shouldn't provide a flag that enables append because we know append has data loss issues
HBase and other programs have data loss when running against a default Hadoop 1.x install
The rationale is pretty clear - this prevents data loss.
Per my earlier comment, I'm open to having an option to disable durable sync if you think that use case is important, but what is that use case? Given that the sync code path is well tested and debugged, why would you want to run with a buggy sync implementation?

This was turned on in the 20 originally and then we had to turn it off due bugs.

Then several us spent many months fixing those bugs, and we haven't seen any since.

Given that the default in Hadoop 1 is off, why not leave it off and give a way to turn it on.
The current default is off and I don't see a reason to change that default in 1.1.
There are installations have that are using the current default.
Can you please explain the reason the make this change?

It's a pain for HBase users to have to manually flip this, and risk data loss if they don't. Changing the default also means we have fewer code paths to maintain for the average user.

Todd Lipcon
added a comment - 07/May/12 00:46 This was turned on in the 20 originally and then we had to turn it off due bugs.
Then several us spent many months fixing those bugs, and we haven't seen any since.
Given that the default in Hadoop 1 is off, why not leave it off and give a way to turn it on.
The current default is off and I don't see a reason to change that default in 1.1.
There are installations have that are using the current default.
Can you please explain the reason the make this change?
It's a pain for HBase users to have to manually flip this, and risk data loss if they don't. Changing the default also means we have fewer code paths to maintain for the average user.

> Couldn't the same be said for any new feature? Given that sync was fixed prior to the 1.0 release,
> I don't see why this should be considered an incompatible change.
This was turned on in the 20 originally and then we had to turn it off due bugs.
Given that the default in Hadoop 1 is off, why not leave it off and give a way to turn it on.
The current default is off and I don't see a reason to change that default in 1.1.
There are installations have that are using the current default.
Can you please explain the reason the make this change?

Sanjay Radia
added a comment - 06/May/12 22:27 > Couldn't the same be said for any new feature? Given that sync was fixed prior to the 1.0 release,
> I don't see why this should be considered an incompatible change.
This was turned on in the 20 originally and then we had to turn it off due bugs.
Given that the default in Hadoop 1 is off, why not leave it off and give a way to turn it on.
The current default is off and I don't see a reason to change that default in 1.1.
There are installations have that are using the current default.
Can you please explain the reason the make this change?

Adding dfs.support.sync flag is along the lines of my previous comments. I am reluctantly okay with enabling it by default. This should be a blocker on 1.1. It might be easy to revert this patch, and add the new flag, as lot of paths to be enabled by the new flag are removed in this patch.

Suresh Srinivas
added a comment - 06/May/12 22:16 Adding dfs.support.sync flag is along the lines of my previous comments. I am reluctantly okay with enabling it by default. This should be a blocker on 1.1. It might be easy to revert this patch, and add the new flag, as lot of paths to be enabled by the new flag are removed in this patch.

Seems a reasonable compromise is to instate a dfs.support.sync flag, but set it to true by default. That way those who are nervous about the bug fix can disable it, but the average user who can have HBase/etc working without additional configuration.

Todd Lipcon
added a comment - 03/May/12 22:36 Seems a reasonable compromise is to instate a dfs.support.sync flag, but set it to true by default. That way those who are nervous about the bug fix can disable it, but the average user who can have HBase/etc working without additional configuration.

We've had sync on by default in hundreds of our customer clusters for almost two years now and have yet to see a related data-loss event. The only bugs we've seen have been bugs where sync() wouldn't provide the correct semantics, but for installs which don't use sync, that doesn't matter.

That is great.

Still, I think we should retain ability to turn it off, because I want to continue running my installation that way and this patch removes that ability.

Suresh Srinivas
added a comment - 03/May/12 22:31 We've had sync on by default in hundreds of our customer clusters for almost two years now and have yet to see a related data-loss event. The only bugs we've seen have been bugs where sync() wouldn't provide the correct semantics, but for installs which don't use sync, that doesn't matter.
That is great.
Still, I think we should retain ability to turn it off, because I want to continue running my installation that way and this patch removes that ability.

Couldn't the same be said for any new feature? Given that sync was fixed prior to the 1.0 release, I don't see why this should be considered an incompatible change. Many much bigger incompatible changes went into 1.0 when compared to earlier 0.20.20x or 0.20.x releases (e.g tarball layout entirely changed, for example). This doesn't affect any of the existing APIs, only the underlying implementations.

Given this is committed for 1.1, not 1.0.x, I don't buy the reasoning that it's too much risk. We've had sync on by default in hundreds of our customer clusters for almost two years now and have yet to see a related data-loss event. The only bugs we've seen have been bugs where sync() wouldn't provide the correct semantics, but for installs which don't use sync, that doesn't matter.

Todd Lipcon
added a comment - 03/May/12 19:59 Couldn't the same be said for any new feature? Given that sync was fixed prior to the 1.0 release, I don't see why this should be considered an incompatible change. Many much bigger incompatible changes went into 1.0 when compared to earlier 0.20.20x or 0.20.x releases (e.g tarball layout entirely changed, for example). This doesn't affect any of the existing APIs, only the underlying implementations.
Given this is committed for 1.1, not 1.0.x, I don't buy the reasoning that it's too much risk. We've had sync on by default in hundreds of our customer clusters for almost two years now and have yet to see a related data-loss event. The only bugs we've seen have been bugs where sync() wouldn't provide the correct semantics, but for installs which don't use sync, that doesn't matter.

The sync fixes were a large number of individual fixes. They added risk for users that were not using the sync feature. Hence Sync was kept off by default for such users - this was a very conscious decision.
Sync should be left off by default with an option to turn it on. This is an incompatible change.

Sanjay Radia
added a comment - 03/May/12 18:15 The sync fixes were a large number of individual fixes. They added risk for users that were not using the sync feature. Hence Sync was kept off by default for such users - this was a very conscious decision.
Sync should be left off by default with an option to turn it on. This is an incompatible change.

Making sync actually work is a bug fix, it was a bug that we allowed people to call sync and unlike append there wasn't a flag to enable it that was disabled by default. Better to fix the default behavior (which allows you to sync).

The implementation earlier used dfs.supports.append to support both durable sync and append. When this flag is off, whole bunch of code got turned off, related to sync functionality on how the blocks are stored, block reports etc. Now with this change, this code can no longer be turned off. I agree with enabling sync by default. However, for folks who chose not to enable the related code and not impacted by it, we need to add a flag to turn off that functionality.

Suresh Srinivas
added a comment - 03/May/12 18:00 Would such an installation be using the sync call?
No from what I know.
From what I understand, the intention of this change is to:
Disable append, since 1.x has bugs in that implementation.
Enable sync by default.
Making sync actually work is a bug fix, it was a bug that we allowed people to call sync and unlike append there wasn't a flag to enable it that was disabled by default. Better to fix the default behavior (which allows you to sync).
The implementation earlier used dfs.supports.append to support both durable sync and append. When this flag is off, whole bunch of code got turned off, related to sync functionality on how the blocks are stored, block reports etc. Now with this change, this code can no longer be turned off. I agree with enabling sync by default. However, for folks who chose not to enable the related code and not impacted by it, we need to add a flag to turn off that functionality.

For testing sync, with this patch, since it is enabled by default, you do not need the flag right?

Correct, after my patch the tests that no longer use append no longer set the append flag. The tests that call append to get its side effects still use the append flag.

Agree w Stack wrt the previous comment. Making sync actually work is a bug fix, it was a bug that we allowed people to call sync and unlike append there wasn't a flag to enable it that was disabled by default. Better to fix the default behavior (which allows you to sync).

Eli Collins
added a comment - 03/May/12 07:24 For testing sync, with this patch, since it is enabled by default, you do not need the flag right?
Correct, after my patch the tests that no longer use append no longer set the append flag. The tests that call append to get its side effects still use the append flag.
Agree w Stack wrt the previous comment. Making sync actually work is a bug fix, it was a bug that we allowed people to call sync and unlike append there wasn't a flag to enable it that was disabled by default. Better to fix the default behavior (which allows you to sync).

stack
added a comment - 03/May/12 05:19 When an installation upgrades to a release with this patch, suddenly sync is enabled and there is no way to disable it.
Would such an installation be using the sync call?

Wrt #2 personally I don't think we should allow people to disable durable sync as that can result in data loss for people running HBase. See HADOOP-8230 for more info. I'm open to having an option to disable durable sync if you think that use case is important.

There are installations where HBase is not used and sync was disabled. Now this patch has removed that option. When an installation upgrades to a release with this patch, suddenly sync is enabled and there is no way to disable it.

(1) there are tests that are using append not to test append per se but for the side effects and we'd lose sync test coverage by removing those tests and (2) per the description we're keeping the append code path in case someone wants to fix the data loss issues in which case it makes sense to keep the test coverage as well.

For testing sync, with this patch, since it is enabled by default, you do not need the flag right?

Suresh Srinivas
added a comment - 02/May/12 17:53 Wrt #2 personally I don't think we should allow people to disable durable sync as that can result in data loss for people running HBase. See HADOOP-8230 for more info. I'm open to having an option to disable durable sync if you think that use case is important.
There are installations where HBase is not used and sync was disabled. Now this patch has removed that option. When an installation upgrades to a release with this patch, suddenly sync is enabled and there is no way to disable it.
(1) there are tests that are using append not to test append per se but for the side effects and we'd lose sync test coverage by removing those tests and (2) per the description we're keeping the append code path in case someone wants to fix the data loss issues in which case it makes sense to keep the test coverage as well.
For testing sync, with this patch, since it is enabled by default, you do not need the flag right?

Wrt #1 see this comment in HDFS-3120 that outlines the proposal that Todd, Nicholas and I thought was best. Feel free to file a follow-on jira for an improvement, happy to review. I'll update the description to match the proposal.

Wrt #2 personally I don't think we should allow people to disable durable sync as that can result in data loss for people running HBase. See HADOOP-8230 for more info. I'm open to having an option to disable durable sync if you think that use case is important.

Wrt #3 the rationale was two-fold: (1) there are tests that are using append not to test append per se but for the side effects and we'd lose sync test coverage by removing those tests and (2) per the description we're keeping the append code path in case someone wants to fix the data loss issues in which case it makes sense to keep the test coverage as well.

Eli Collins
added a comment - 02/May/12 05:41 - edited Thanks for chiming in Suresh.
Wrt #1 see this comment in HDFS-3120 that outlines the proposal that Todd, Nicholas and I thought was best. Feel free to file a follow-on jira for an improvement, happy to review. I'll update the description to match the proposal.
Wrt #2 personally I don't think we should allow people to disable durable sync as that can result in data loss for people running HBase. See HADOOP-8230 for more info. I'm open to having an option to disable durable sync if you think that use case is important.
Wrt #3 the rationale was two-fold: (1) there are tests that are using append not to test append per se but for the side effects and we'd lose sync test coverage by removing those tests and (2) per the description we're keeping the append code path in case someone wants to fix the data loss issues in which case it makes sense to keep the test coverage as well.

Suresh Srinivas
added a comment - 30/Apr/12 20:50 Eli, sorry for the late comment. I agree with the general direction of splitting hflush/hsync feature from append. Perhaps these features should be using two different flags.
I have concerns with this change:
I thought the proposal from HDFS-3120 was to add "dfs.support.sync". I do not see that flag in this patch.
There are installations where hsync/hflush is disabled, using dfs.support.append. That option should be preserved.
"dfs.support.broken.append" - why add this and not delete the tests that are testing append functionality?