IMO 22 is not the best place to invest resources. I support nigel's suggestion of abandoning it, but people are free to work on what they are passionate about.

E14

On Sep 3, 2011, at 11:11 AM, Nigel Daley wrote:

> Matt, Others,
>
> Is the expectation that these fixes go into other release branches too (including 0.22, 0.23) if applicable?
>
> If not, my concern is that 0.22 is regressing further from the popular Apache release and I'm inclined to abandon 0.22. Thoughts?
>
> Cheers,
> Nige

I am very interested in hearing what the community thinks about this issue. I believe what happens here has long-term consequences.

AFAIK, the bylaws do not mention anything about "abandoning release". If past is any indication, even if releases 0.19.*, and 0.21.* were not adopted by large hadoop installations, these releases were cut, voted on, and approved. Not "abandoned".

I am -1 (non-binding) on abandoning 0.22.

- Milind

---
Milind Bhandarkar
Greenplum Labs, EMC
(Disclaimer: Opinions expressed in this email are those of the author, and do not necessarily represent the views of any organization, past or present, the author might be affiliated with.)

On 9/6/11 4:06 AM, "Eric Baldeschwieler" <[EMAIL PROTECTED]> wrote:

>What do others think?
>
>IMO 22 is not the best place to invest resources. I support nigel's suggestion of abandoning it, but people are free to work on what they are passionate about.
>
>E14
>
>On Sep 3, 2011, at 11:11 AM, Nigel Daley wrote:
>
>> Matt, Others,
>>
>> Is the expectation that these fixes go into other release branches too (including 0.22, 0.23) if applicable?
>>
>> If not, my concern is that 0.22 is regressing further from the popular Apache release and I'm inclined to abandon 0.22. Thoughts?
>>
>> Cheers,
>> Nige

Nigel is the RM for the current 0.22 branch. If he abandons it then that branch is done, but nobody has a lock on the next release. Any committer can manage a branch, package its contents, and propose an artifact. If it receives a majority of votes from the PMC, then it becomes 0.22. Whether it's derived from the current 0.22 branch is up to that RM.

Anyone keen to see a particular release must do the work to effect it. We don't assign work by voting.

On Tue, Sep 6, 2011 at 12:08 PM, <[EMAIL PROTECTED]> wrote:
> AFAIK, the bylaws do not mention anything about "abandoning release". If past is any indication, even if releases 0.19.*, and 0.21.* were not adopted by large hadoop installations, these releases were cut, voted on, and approved. Not "abandoned".

To your point, a branch targeted as 0.21 was abandoned, despite months of backporting changes from trunk. A branch containing subsequent work was released instead. Whether the current 0.23 branch is released as 0.22 will depend exclusively on their respective effort and support.

Though you're right: this probably should be written into the bylaws, even if any contrary model is illusory in practice.

-C

It would take the same amount of resources to fix 0.22 as to merge append and security branches aka 0.20.205.
Although I understand that Hortonworks needs to support its customer(s) and is eager to bridge the gap in functionality with its competitor(s), I think continuing with 0.20 a-three-years-old-technology is not the best place to invest resources. In the past you advocated for 0.21 and 0.22, both now abandoned by your team(s) in favor of enhancing 0.20. It will be sad to see this backward/forward porting going on forever, diverging the Apache Hadoop project from natural evolutionary process.

I think 0.22 has all the functionality required to run Hadoop for most production tasks. I see enough momentum and involvement in the community with 0.22 testing. I think there will be enough resources to get it stabilized in near future.

Nigel,

your comment can be understood as a request to commit important fixes to the 0.22 branch. I agree with that. But if you choose to abandon the RM role I will volunteer to take it over.

Thanks,
--Konstantin


> I am very interested in hearing what the community thinks about this issue. I believe what happens here has long-term consequences.
>
> AFAIK, the bylaws do not mention anything about "abandoning release". If past is any indication, even if releases 0.19.*, and 0.21.* were not adopted by large hadoop installations, these releases were cut, voted on, and approved. Not "abandoned".
>
> I am -1 (non-binding) on abandoning 0.22.

I would note that folks will use the latest release if it has features they want. I know of about 8PB of HDFS deployments that used 0.19. Bugs were found, bugs were squashed, patches were committed.

It would be nice to not waste the 0.22 effort, even if we call it a "development release" in the end.

I've never advocated 21 or 22 since I've never been able to volunteer help to work on them.

I think it's great if you want to run another branch, do testing etc. But given that it's neither complete, stable nor the latest majority supported project, I think the expectation should be that the RM / community behind 22 should take responsibility for backporting whatever patches they care about from trunk or 23.

(just like 0.20.20*)

Also, just like 20, the age is irrelevant; what matters is what folks are volunteering to work on. I support you volunteering to do whatever interests you. I'm marshaling help for 20 & 23 because these seem like the best ways to address the needs of the widest community IMO.


On Sep 7, 2011, at 2:04 AM, Konstantin Shvachko wrote:
> I think continuing with 0.20 a-three-years-old-technology is not the best place to invest resources. In the past you advocated for 0.21 and 0.22, both now abandoned by your team(s) in favor of enhancing 0.20. It will be sad to see this backward/forward porting going on forever, diverging the Apache Hadoop project from natural evolutionary process.

Using real data helps - from Apache Jira, here are the statistics for work on trunk/hadoop-0.23 in Q3 of 2011 (i.e. last 2 months alone):

Hadoop Common - 224 resolved* jiras
Hadoop HDFS - 153 resolved* jiras
Hadoop MapReduce - 161* resolved jiras

Total of 538 jiras resolved. I'm sure LOC is much more impressive, but be that as it may.

OTOH, I haven't run reports (can't figure out how), but a cursory glance at CHANGES.txt shows hadoop-0.20.2xx has < 50, while 0.22 has ~10.

So, I'd say it's pretty clear that there is significant interest, involvement & investment from the wider community. That makes me believe that Apache Hadoop is moving along the right path i.e. forward.

thanks,
Arun

* I'm aware that 80% of stats are only 80% right *smile* - these are 'resolved' jiras. It's fair to assume vast majority of the jiras were actually 'fixed'.

On Sep 7, 2011, at 9:15 AM, Arun C Murthy wrote:
> Using real data helps - from Apache Jira, here are the statistics for work on trunk/hadoop-0.23 in Q3 of 2011 (i.e. last 2 months alone):
>
> Hadoop Common - 224 resolved* jiras
> Hadoop HDFS - 153 resolved* jiras
> Hadoop MapReduce - 161* resolved jiras
>
> Total of 538 jiras resolved. I'm sure LOC is much more impressive, but be that as it may.

I wonder about the usefulness of those stats though. Many of those jiras are bug fixes to other jiras committed in the same time frame and also committed to other branches. Also, I've noticed an absolute explosion of sub-tasks where one big patch is actually done at commit time.

> So, I'd say it's pretty clear that there is significant interest, involvement & investment from the wider community. That makes me believe that Apache Hadoop is moving along the right path i.e. forward.

Part of the problem that the Hadoop community has is a PMC and committer group that leans heavily towards a few organizations. Any move that those organizations make offsets what the rest of the community may or may not want to do. The reality is that if anyone who isn't Hortonworks or Cloudera wants to do a release, it is likely doomed at PMC vote time.

[I'll hold off on commenting on the effectiveness of our current PMC member roster. That's a different discussion altogether.]

On Wed, Sep 07, 2011 at 02:04AM, Konstantin Shvachko wrote:
> Eric,
>
> It would take the same amount of resources to fix 0.22 as to merge append and security branches aka 0.20.205.
> Although I understand that Hortonworks needs to support its customer(s) and is eager to bridge the gap in functionality with its competitor(s), I think continuing with 0.20 a-three-years-old-technology is not the best place to invest resources. In the past you advocated for 0.21 and 0.22, both now abandoned by your team(s) in favor of enhancing 0.20. It will be sad to see this backward/forward porting going on forever, diverging the Apache Hadoop project from natural evolutionary process.
>
> I think 0.22 has all the functionality required to run Hadoop for most production tasks. I see enough momentum and involvement in the community with 0.22 testing. I think there will be enough resources to get it stabilized in near future.
>
> Nigel,
>
> your comment can be understood as a request to commit important fixes to the 0.22 branch. I agree with that. But if you choose to abandon the RM role I will volunteer to take it over.

I think this is a grand idea! 0.22 is very close to the stable state. On the other hand there's no indication of how soon the community can expect to get 0.23 out.

I will be happy to offer any help to complete 0.22, Konstantin!

Cos


On 07/09/11 10:04, Konstantin Shvachko wrote:
> Eric,
>
> It would take the same amount of resources to fix 0.22 as to merge append and security branches aka 0.20.205.
> Although I understand that Hortonworks needs to support its customer(s) and is eager to bridge the gap in functionality with its competitor(s), I think continuing with 0.20 a-three-years-old-technology is not the best place to invest resources. In the past you advocated for 0.21 and 0.22, both now abandoned by your team(s) in favor of enhancing 0.20. It will be sad to see this backward/forward porting going on forever, diverging the Apache Hadoop project from natural evolutionary process.
>
> I think 0.22 has all the functionality required to run Hadoop for most production tasks. I see enough momentum and involvement in the community with 0.22 testing. I think there will be enough resources to get it stabilized in near future.

This is interesting.

1. I've been doing some 0.20.x work and hitting bugs that I know have been fixed in trunk a while ago but never backported as they were things that weren't critical enough to the people using the 20.x branches (e.g. problems related to my home network, issues w/ embedding the JARs, etc).

This is why I have to disagree with eric14's "age is irrelevant" claim. The APIs show their age, so do other quirks. It's just a known set of quirks, like Windows XP is today.

2. 0.23 will take a while to stabilise; a big barrier is that projects on top of hadoop need to test it. Bigtop can help here, but it will still take time.

3. Where is all maintenance of MR 1.0 code going to go? It can't go in trunk or 0.23, as that's on MR2.x. Should all changes to MR1.0 be backported to 0.20.x, and new stuff put in there?

Or are we declaring a complete block on all upgrade paths that don't involve a migration to 0.23 or staying on the 0.20.x branch -with only a subset of fixes and an aging API- available?

The other reason for a 0.22 release is Apache Bigtop only plans to release full stacks of released ASF code, and a lot of the things in the ASF Hadoop ecosystem do need the 0.21+ APIs (MRUnit, flume). A 0.22 release is something that bigtop could get behind.

On Fri, Sep 9, 2011 at 3:40 AM, Steve Loughran <[EMAIL PROTECTED]> wrote:
> On 07/09/11 10:04, Konstantin Shvachko wrote:
>>
>> Eric,
>>
>> It would take the same amount of resources to fix 0.22 as to merge append and security branches aka 0.20.205.
>> Although I understand that Hortonworks needs to support its customer(s) and is eager to bridge the gap in functionality with its competitor(s), I think continuing with 0.20 a-three-years-old-technology is not the best place to invest resources. In the past you advocated for 0.21 and 0.22, both now abandoned by your team(s) in favor of enhancing 0.20. It will be sad to see this backward/forward porting going on forever, diverging the Apache Hadoop project from natural evolutionary process.
>>
>> I think 0.22 has all the functionality required to run Hadoop for most production tasks. I see enough momentum and involvement in the community with 0.22 testing. I think there will be enough resources to get it stabilized in near future.
>
> this is interesting.
>
> 1. I've been doing some 0.20.x work and hitting bugs that I know have been fixed in trunk a while ago but never backported as they were things that weren't critical enough to the people using the 20.x branches (e.g. problems related to my home network, issues w/ embedding the JARs, etc).
>
> This is why I have to disagree with eric14's "age is irrelevant" claim. The APIs show their age, so do other quirks. It's just a known set of quirks -like WindowsXP is today.
>
> 2. 0.23 will take a while to stabilise; a big barrier is that projects on top of hadoop need to test it. Bigtop can help here, but it will still take time.
>
> 3. Where is all maintenance of MR 1.0 code going to go? It can't go in trunk or 0.23, as that's on MR2.x. Should all changes to MR1.0 be backported to 0.20.x, and new stuff put in there?

MR1 is being maintained on 20x. In fact 20x is the only MR1 code that supports security and disk failure handling. The MR1 code in 22 is a regression in some significant aspects (features, performance, bugs) from the latest stable MR1 (204).

> Or are we declaring a complete block on all upgrade paths that don't involve a migration to 0.23 or staying on the 0.20.x branch -with only a subset of fixes and an aging API- available?

My understanding of the way Apache operates is that you can't do things like "declare blocks on upgrade paths". People can try to release updates to 21 or 22 (or some new tree). Ie the decisions are made implicitly by where people invest cycles.

> MR1 is being maintained on 20x. In fact 20x is the only MR1 code that supports security and disk failure handling. The MR1 code in 22 is a regression in some significant aspects (features, performance, bugs) from the latest stable MR1 (204).
> ...

Eli, aside from the disk failure handling which is a new feature in 205 (not present in earlier 20x releases), could you please elaborate on which other significant aspects 22 would regress from 20x?

> My understanding of the way Apache operates is that you can't do things like "declare blocks on upgrade paths". People can try to release updates to 21 or 22 (or some new tree). Ie the decisions are made implicitly by where people invest cycles.

If a group of committers say that they'll commit to trunk, to 0.23 and to 0.20x, but not to 0.22, then in effect that is like to "declare a block on upgrade path" isn't it? The more such commits go into other branches but not the ones in between, the more it is essentially a declaration of a block, because of the very regression argument you make.

I agree that nobody can be made to contribute to something they don't want, but does that result in a split?
In other words, if a significant bug fix or feature goes into trunk and into 22, can developers then simply say: "I'm not interested to put this into 23, you do this yourself if you want?". Will that be tolerated or vetoed?

On Fri, Sep 9, 2011 at 10:41 AM, Rottinghuis, Joep <[EMAIL PROTECTED]> wrote:
>
> -----Original Message-----
> From: Eli Collins [mailto:[EMAIL PROTECTED]]
> Sent: Friday, September 09, 2011 10:03 AM
> To: [EMAIL PROTECTED]
> Subject: Re: abandoning 22 - was: Content request for 0.20.205 Sustaining Release
>
> ...
>
>> MR1 is being maintained on 20x. In fact 20x is the only MR1 code that supports security and disk failure handling. The MR1 code in 22 is a regression in some significant aspects (features, performance, bugs) from the latest stable MR1 (204).
> ...
>
> Eli, aside from the disk failure handling which is a new feature in 205 (not present in earlier 20x releases), could you please elaborate on which other significant aspects 22 would regress from 20x?

Check out the MR jiras in the branch-20-security change log. There's a ton of performance, feature and stability work. 22 doesn't have most of this.

>> My understanding of the way Apache operates is that you can't do things like "declare blocks on upgrade paths". People can try to release updates to 21 or 22 (or some new tree). Ie the decisions are made implicitly by where people invest cycles.
>
> If a group of committers say that they'll commit to trunk, to 0.23 and to 0.20x, but not to 0.22, then in effect that is like to "declare a block on upgrade path" isn't it?

The release manager - not the developers - are responsible for and have the final say as to what patches get merge to their branch. If the RM wants all this work they need to either corral the developers to do the merging or do the merging themselves. In short, it's their responsibility to get people to invest in the branch.

On Sep 9, 2011, at 10:41 AM, Rottinghuis, Joep wrote:

No one is going to block you from doing any work you want.

All that is required is to have the work in trunk and subsequent branches, i.e. 0.23 (as applicable).

The problem is that 0.22 hasn't seen major movement for almost a year since it was branched and there is no incentive for lots of people to contribute - plus there is MRv2, which is a completely different beast (see the discussion that Eli pointed to).

None of this is to say you shouldn't contribute or use 0.22; you move the project as you wish by your contributions.

>> MR1 is being maintained on 20x. In fact 20x is the only MR1 code that supports security and disk failure handling. The MR1 code in 22 is a regression in some significant aspects (features, performance, bugs) from the latest stable MR1 (204).
> ...
>
> Eli, aside from the disk failure handling which is a new feature in 205 (not present in earlier 20x releases), could you please elaborate on which other significant aspects 22 would regress from 20x?

I've talked to Konstantin about this.

There is tonnes of performance work missing, including scaling work on JobTracker, CapacityScheduler etc. There is work to add a ton of limits (counters, tasks, etc. etc.). Then there is operability work such as JobHistory, handling logs etc. The last time I benchmarked 22 v/s 20.xxx there was >5x difference and that was more than a year ago. Arguably some of the operability work won't matter for small clusters, but you are welcome to make your own decisions.

It's unfortunate we have landed here, but 22 branched almost a year ago and hence none of this work was ported there, since a branch implies only critical bug fixes were supposed to go in. That plus the problems with scaling MR1 which led to investment in MR1 is where we are. As a result, there is no enthusiasm to contribute to MR1 from the vast majority of devs given that we've decided we won't support it.

>> My understanding of the way Apache operates is that you can't do things like "declare blocks on upgrade paths". People can try to release updates to 21 or 22 (or some new tree). Ie the decisions are made implicitly by where people invest cycles.
>
> If a group of committers say that they'll commit to trunk, to 0.23 and to 0.20x, but not to 0.22, then in effect that is like to "declare a block on upgrade path" isn't it? The more such commits that go in to other branches, but not the ones in between essentially is a declaration of a block, because of the very regression argument you make.
>
> I agree that nobody can be made to contribute to something they don't want, but does that result in a split?
> In other words, if a significant bug fix or feature goes into trunk and into 22, can developers then simply say: "I'm not interested to put this into 23, you do this yourself if you want?". Will that be tolerated or vetoed?

Again, no one is going to veto anything. As Eli said, decisions are made by people's code.

On Fri, Sep 9, 2011 at 11:08 AM, Eli Collins <[EMAIL PROTECTED]> wrote:
> The release manager - not the developers - are responsible for and have the final say as to what patches get merge to their branch. If the RM wants all this work they need to either corral the developers to do the merging or do the merging themselves. In short, it's their responsibility to get people to invest in the branch.

On Fri, Sep 09, 2011 at 12:21PM, Chris Douglas wrote:
> On Fri, Sep 9, 2011 at 11:08 AM, Eli Collins <[EMAIL PROTECTED]> wrote:
> > The release manager - not the developers - are responsible for and have the final say as to what patches get merge to their branch. If the RM wants all this work they need to either corral the developers to do the merging or do the merging themselves. In short, it's their responsibility to get people to invest in the branch.
>
> This.
>
> The rest of this thread is pointless. -C

On Sep 9, 2011, at 11:08 AM, Eli Collins wrote:
> The release manager - not the developers - are responsible for and have the final say as to what patches get merge to their branch.

That is simply false. At no time is any individual responsible for any part of Apache subversion, no matter how obscure the branch. No RM has the final say on anything other than their own work. That is, they can choose not to produce a release candidate. Furthermore, at no time whatsoever does any person "own" the job of being RM -- there can be five active RMs on a single branch, each producing release candidates based on what is in subversion at the particular time that they decide to tag and build.

> If the RM wants all this work they need to either corral the developers to do the merging or do the merging themselves. In short, it's their responsibility to get people to invest in the branch.

That is true of anyone, on or off the PMC -- not just the RM.

....Roy

On Fri, Sep 09, 2011 at 05:08PM, [EMAIL PROTECTED] wrote:
> Has anyone seen the RM for 0.22 lately ?
>
> - milind
>
> On 9/9/11 1:46 PM, "Roy T. Fielding" <[EMAIL PROTECTED]> wrote:
>
> >On Sep 9, 2011, at 11:08 AM, Eli Collins wrote:
> >> The release manager - not the developers - are responsible for and have the final say as to what patches get merge to their branch.
> >
> >That is simply false. At no time is any individual responsible for any part of Apache subversion, no matter how obscure the branch. No RM has the final say on anything other than their own work. That is, they can choose not to produce a release candidate. Furthermore, at no time whatsoever does any person "own" the job of being RM -- there can be five active RMs on a single branch, each producing release candidates based on what is in subversion at the particular time that they decide to tag and build.
> >
> >> If the RM wants all this work they need to either corral the developers to do the merging or do the merging themselves. In short, it's their responsibility to get people to invest in the branch.
> >
> >That is true of anyone, on or off the PMC -- not just the RM.
> >
> >....Roy

On Fri, Sep 9, 2011 at 1:46 PM, Roy T. Fielding <[EMAIL PROTECTED]> wrote:
> On Sep 9, 2011, at 11:08 AM, Eli Collins wrote:
>> The release manager - not the developers - are responsible for and have the final say as to what patches get merge to their branch.
>
> That is simply false. At no time is any individual responsible for any part of Apache subversion, no matter how obscure the branch.

This seems to contradict the Apache release policy (http://httpd.apache.org/dev/release.html), and the practice adopted by this project. E.g. "Regarding what makes it into a release, the RM is the unquestioned authority. No one can contest what makes it into the release."

If there is no individual responsible for a release branch then who is "the RM" that is the unquestioned authority for a release?

The page states that "there is no set RM"; however, in our project Nigel, Arun, and Matt volunteered to RM 22, 23 and 20x respectively. Are these people in fact not responsible for their release branches in svn, e.g. any committer is free to merge a patch from trunk into one of these branches?

For completeness, Chris proposed a while back [1] that the RM is selected by majority PMC approval. He never proposed a vote to adopt these rules, but if we did, would they be invalid, i.e. according to other rules established at Apache? I.e. does this project have the ability to establish its own rules/norms w/o you telling us how our project works?

The release policy - "Regarding what makes it into a release, the RM is the unquestioned authority" - seems to indicate "the RM" has final say about what makes it into a release. This implies that either the release is just the RM's work (vs the work of the community) or in fact the RM is not the unquestioned authority on a release.

I think there's a huge gap between the current understanding of the RM in our community and what you've outlined.

But did he do the lease recovery (for more info: HDFS-265)? I have seen the initiation of lease recovery by Konst, but haven't seen acks.

- milind

On 9/9/11 2:13 PM, "Konstantin Boudnik" <[EMAIL PROTECTED]> wrote:

>Konstantin has stepped forward a couple of days ago ;)
>
>On Fri, Sep 09, 2011 at 05:08PM, [EMAIL PROTECTED] wrote:
>> Has anyone seen the RM for 0.22 lately ?
>>
>> - milind
>>
>> On 9/9/11 1:46 PM, "Roy T. Fielding" <[EMAIL PROTECTED]> wrote:
>>
>> >On Sep 9, 2011, at 11:08 AM, Eli Collins wrote:
>> >> The release manager - not the developers - are responsible for and have the final say as to what patches get merge to their branch.
>> >
>> >That is simply false. At no time is any individual responsible for any part of Apache subversion, no matter how obscure the branch. No RM has the final say on anything other than their own work. That is, they can choose not to produce a release candidate. Furthermore, at no time whatsoever does any person "own" the job of being RM -- there can be five active RMs on a single branch, each producing release candidates based on what is in subversion at the particular time that they decide to tag and build.
>> >
>> >> If the RM wants all this work they need to either corral the developers to do the merging or do the merging themselves. In short, it's their responsibility to get people to invest in the branch.
>> >
>> >That is true of anyone, on or off the PMC -- not just the RM.
>> >
>> >....Roy

On Fri, Sep 09, 2011 at 05:55PM, [EMAIL PROTECTED] wrote:
> But did he do the lease recovery (for more info: HDFS-265)? I have seen the initiation of lease recovery by Konst, but haven't seen acks.

I am not sure I follow you, Milind. HDFS-265 has been in since 0.21. Are you referring to some new development that I might've missed?

Cos


On Fri, Sep 9, 2011 at 2:58 PM, Konstantin Boudnik <[EMAIL PROTECTED]> wrote:
> On Fri, Sep 09, 2011 at 05:55PM, [EMAIL PROTECTED] wrote:
>> But did he do the lease recovery (for more info: HDFS-265)? I have seen
>> the initiation of lease recovery by Konst, but haven't seen acks.
>
> I am not sure I follow you, Milind. HDFS-265 has been in since 0.21. Are you
> referring to some new development that I might've missed?

Milind is saying that Konst volunteered but Nigel hasn't taken him up on it yet.

Never mind. It was an oblique reference to client not writing to a file for a long time, so hdfs recovering the lease. (RM=client, release=file, recovery=transferring RM role. :-)

- milind


On Fri, Sep 09, 2011 at 06:17PM, [EMAIL PROTECTED] wrote:
> Never mind. It was an oblique reference to client not writing to a file
> for a long time, so hdfs recovering the lease. (RM=client, release=file,
> recovery=transferring RM role. :-)


On Fri, Sep 9, 2011 at 2:41 PM, Eli Collins <[EMAIL PROTECTED]> wrote:
> For completeness, Chris proposed a while back [1] that the RM is
> selected by majority PMC approval. He never proposed a vote to adopt
> these rules, but if we did, would they be invalid, i.e. according to
> other rules established at Apache? I.e. does this project have the
> ability to establish its own rules/norms w/o you telling us how our
> project works?
>
> 1. http://mail-archives.apache.org/mod_mbox/hadoop-general/201005.mbox/%[EMAIL PROTECTED]%3E

The intent was to evolve the policies we had at the time to an RM-like role. As Tom raised in that thread, much of the intermediate phase is unnecessary. In the project's current state, much of that proposal is no longer coherent.

For instance, electing an RM is an unnecessary formality. Further, enforcing the version compatibility rules across the 0.20.2xx and various 0.2x branches is prohibitively expensive. As demonstrated by previous experience, making prohibitively expensive rules motivates external forks, not coherence. They're useful guidelines for what the PMC is likely to approve, but I expect we'll show more flexibility than what that proposal outlined.

In practice, we already exercise all the important elements of that proposal. Someone creates a branch and intends to release from it, others may help them, and if an artifact is produced then the PMC votes on whether to release it. Any debate, accusation, and recrimination concerning "investing" in a branch is pointless because the "result" of the debate is irrelevant. Other than venting some spleen, this thread has no functional consequence.

On Fri, Sep 9, 2011 at 1:46 PM, Roy T. Fielding <[EMAIL PROTECTED]> wrote:
>> No RM has the final say on anything other than their own work.

That's consistent with the position as forwarded. We're using "the RM's branch" as a shorthand for "whatever the RM has elected to include." Your comments highlight nuance, not fundamental dissonance.

> I think there's a huge gap between the current understanding of the RM> in our community and what you've outlined.

It's vanishingly small, but important. Ownership of a branch isn't vested in an RM, neither is it transferrable. If someone wanted to commit something to a branch, they aren't required to ask the RM. Now, it's *polite*, and I hope that most would give a heads-up for anything they were unsure of, but the repository isn't a hierarchy of dictatorships. The source tree is lock-free. -C

> On Fri, Sep 9, 2011 at 2:41 PM, Eli Collins <[EMAIL PROTECTED]> wrote:>> For completeness, Chris proposed a while back [1] that the RM is>> selected by majority PMC approval. He never proposed a vote to adopt>> these rules, but if we did, would they be invalid, ie according to>> other rules established at Apache? Ie does this project have the>> ability to establish it's own rules/norms w/o you telling us how our>> project works?>> >> 1. http://mail-archives.apache.org/mod_mbox/hadoop-general/201005.mbox/%[EMAIL PROTECTED]%3E> > The intent was to evolve the policies we had at the time to an RM-like> role. As Tom raised in that thread, much of the intermediate phase is> unnecessary. In the project's current state, much of that proposal is> no longer coherent.> > For instance, electing an RM is an unnecessary formality. Further,> enforcing the version compatibility rules across the 0.20.2xx and> various 0.2x branches is prohibitively expensive. As demonstrated by> previous experience, making prohibitively expensive rules motivates> external forks, not coherence. They're useful guidelines for what the> PMC is likely to approve, but I expect we'll show more flexibility> than what that proposal outlined.> > In practice, we already exercise all the important elements of that> proposal. Someone creates a branch and intends to release from it,> others may help them, and if an artifact is produced then the PMC> votes on whether to release it. Any debate, accusation, and> recrimination concerning "investing" in a branch is pointless because> the "result" of the debate is irrelevant. Other than venting some> spleen, this thread has no functional consequence.> > On Fri, Sep 9, 2011 at 1:46 PM, Roy T. Fielding <[EMAIL PROTECTED]> wrote:>>> No RM has the final say on anything other than their own work.> > That's consistent with the position as forwarded. We're using "the> RM's branch" as a shorthand for "whatever the RM has elected to> include." 
Your comments highlight nuance, not fundamental dissonance.> >> I think there's a huge gap between the current understanding of the RM>> in our community and what you've outlined.> > It's vanishingly small, but important. Ownership of a branch isn't> vested in an RM, neither is it transferrable. If someone wanted to> commit something to a branch, they aren't required to ask the RM. Now,> it's *polite*, and I hope that most would give a heads-up for anything> they were unsure of, but the repository isn't a hierarchy of> dictatorships. The source tree is lock-free. -C++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++Chris Mattmann, Ph.D.Senior Computer ScientistNASA Jet Propulsion Laboratory Pasadena, CA 91109 USAOffice: 171-266B, Mailstop: 171-246Email: [EMAIL PROTECTED]WWW: http://sunset.usc.edu/~mattmann/++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++Adjunct Assistant Professor, Computer Science DepartmentUniversity of Southern California, Los Angeles, CA 90089 USA++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

On Fri, Sep 9, 2011 at 1:46 PM, Roy T. Fielding <[EMAIL PROTECTED]> wrote:
> On Sep 9, 2011, at 11:08 AM, Eli Collins wrote:
>> The release manager - not the developers - are responsible for and
>> have the final say as to what patches get merged to their branch.
>
> That is simply false. At no time is any individual responsible for
> any part of Apache subversion, no matter how obscure the branch.
> No RM has the final say on anything other than their own work.

It seems hard for an RM to be the final authority of what makes it into a release w/o also being the final authority on what patches get merged into the branch they're trying to drive a release from.

Thanks,
Eli

> That is, they can choose not to produce a release candidate.
> Furthermore, at no time whatsoever does any person "own" the job
> of being RM -- there can be five active RMs on a single branch,
> each producing release candidates based on what is in subversion
> at the particular time that they decide to tag and build.
>
>> If the RM wants all this work they need to either corral the developers
>> to do the merging or do the merging themselves. In short, it's their
>> responsibility to get people to invest in the branch.
>
> That is true of anyone, on or off the PMC -- not just the RM.
>
> ....Roy

On Fri, Sep 9, 2011 at 4:42 PM, Chris Douglas <[EMAIL PROTECTED]> wrote:> On Fri, Sep 9, 2011 at 2:41 PM, Eli Collins <[EMAIL PROTECTED]> wrote:>> For completeness, Chris proposed a while back [1] that the RM is>> selected by majority PMC approval. He never proposed a vote to adopt>> these rules, but if we did, would they be invalid, ie according to>> other rules established at Apache? Ie does this project have the>> ability to establish it's own rules/norms w/o you telling us how our>> project works?>>>> 1. http://mail-archives.apache.org/mod_mbox/hadoop-general/201005.mbox/%[EMAIL PROTECTED]%3E>> The intent was to evolve the policies we had at the time to an RM-like> role. As Tom raised in that thread, much of the intermediate phase is> unnecessary. In the project's current state, much of that proposal is> no longer coherent.>> For instance, electing an RM is an unnecessary formality. Further,> enforcing the version compatibility rules across the 0.20.2xx and> various 0.2x branches is prohibitively expensive. As demonstrated by> previous experience, making prohibitively expensive rules motivates> external forks, not coherence. They're useful guidelines for what the> PMC is likely to approve, but I expect we'll show more flexibility> than what that proposal outlined.>> In practice, we already exercise all the important elements of that> proposal. Someone creates a branch and intends to release from it,> others may help them, and if an artifact is produced then the PMC> votes on whether to release it. Any debate, accusation, and> recrimination concerning "investing" in a branch is pointless because> the "result" of the debate is irrelevant. Other than venting some> spleen, this thread has no functional consequence.>> On Fri, Sep 9, 2011 at 1:46 PM, Roy T. Fielding <[EMAIL PROTECTED]> wrote:>>> No RM has the final say on anything other than their own work.>> That's consistent with the position as forwarded. 
We're using "the> RM's branch" as a shorthand for "whatever the RM has elected to> include." Your comments highlight nuance, not fundamental dissonance.>>> I think there's a huge gap between the current understanding of the RM>> in our community and what you've outlined.>> It's vanishingly small, but important. Ownership of a branch isn't> vested in an RM, neither is it transferrable. If someone wanted to> commit something to a branch, they aren't required to ask the RM. Now,> it's *polite*, and I hope that most would give a heads-up for anything> they were unsure of, but the repository isn't a hierarchy of> dictatorships. The source tree is lock-free. -C>

I think the main point is that RMs correspond to releases, not a release branch. This distinction makes more sense, e.g. I think it would be good to have a set of people RM the dot releases off a given branch instead of an RM for a branch and a series of releases from it (what it seems like we've been doing).

> On Fri, Sep 9, 2011 at 1:46 PM, Roy T. Fielding <[EMAIL PROTECTED]> wrote:
>> On Sep 9, 2011, at 11:08 AM, Eli Collins wrote:
>>> The release manager - not the developers - are responsible for and
>>> have the final say as to what patches get merged to their branch.
>>
>> That is simply false. At no time is any individual responsible for
>> any part of Apache subversion, no matter how obscure the branch.
>
> This seems to contradict the Apache release policy
> (http://httpd.apache.org/dev/release.html), and the practice adopted
> by this project. Eg "Regarding what makes it into a release, the RM is
> the unquestioned authority. No one can contest what makes it into the
> release."

I think someone got carried away in their artistic flair. I wrote the original httpd guidelines.

> If there is no individual responsible for a release branch then who is
> "the RM" that is the unquestioned authority for a release?
>
> The page states that "there is no set RM" however in our project
> Nigel, Arun, and Matt volunteered to RM 22, 23 and 20x respectively.

Volunteers are always needed. That doesn't mean they are the only volunteers, nor does it mean they have special veto powers.

> Are these people in fact not responsible for their release branches in
> svn, eg any committer is free to merge a patch from trunk into one of
> these branches?

Any committer can commit wherever they have been given permission to commit by the PMC. Generally, they do so collaboratively. I've never encountered a situation in my own projects in which developers were committing at cross-purposes, even when they disagree on content, though I've seen commit wars elsewhere. We'd expect the PMC to step in if they did.

The only thing the RM has authority over is the building of a source package, based on the contents of our subversion, that can then be put up for vote. They can decide what snapshot to tag for a build. They can decide not to build anything at all. They can also do all sorts of organizational support, advocacy, pleading, or whatever in order to encourage the rest of the project committers to apply changes, vote for things under issue, etc.

They do not have the right to pick and select whatever variation of the product they might like to build, short of vetoing (with a valid reason) any changes that they as a PMC member believe do not belong on the branch. Likewise, the RM cannot include in the build any change that has been vetoed by others, and their build cannot be released if it contains any such changes that have been vetoed since it was built. The RM has the right to kill their own build if they learn something during the release process that they think, for whatever reason, causes the build to be unreleasable. But the RM can't stop anyone else on the PMC from taking the same build and calling for its release under their own management as RM.

> For completeness, Chris proposed a while back [1] that the RM is
> selected by majority PMC approval. He never proposed a vote to adopt
> these rules, but if we did, would they be invalid, i.e. according to
> other rules established at Apache? I.e. does this project have the
> ability to establish its own rules/norms w/o you telling us how our
> project works?
>
> 1. http://mail-archives.apache.org/mod_mbox/hadoop-general/201005.mbox/%[EMAIL PROTECTED]%3E

The ASF supports collaborative development of open source through the work of individual volunteers. If the rules are consistent with that mission and are applied consistently, then the board is unlikely to intervene.

As a board member, what I look for is the effect that those rules have on collaboration. If I see a bunch of happy developers collaborating on releases, it really doesn't matter what the rules are since the rules exist to prevent technical disagreements from escalating into social conflicts. If, however, I see no significant releases, work being done elsewhere, or the project split into multiple branches that happen to match corporate fiefdoms, then it is my responsibility to do something to get it back to being a collaboration. Sometimes that means we change the rules.

> Nigel,
>
> your comment can be understood as a request to commit important fixes
> to the 0.22 branch. I agree with that. But if you choose to abandon
> the RM role I will volunteer to take it over.
>
> Thanks,
> --Konstantin

[catching up on the week's email]

Konstantin, you are free to carry on this 0.22 work and call a release vote if you wish.

To me it is becoming clearer every week that 0.22 continues to regress in significant ways from 0.20.20x releases which will cause further confusion in our user base. Thus I will no longer manage a release from the 0.22 branch. When 0.20.200 was proposed, I thought this would be the only such release and regressions to 0.22 would be manageable. This is no longer the case now that there have been many 0.20.20x releases with significant changes and committers have not been merging these changes to intervening branches as was once common practice.

At the current time, I believe the most value that we (the developer community) have derived from 0.22 was finding/fixing issues throughout the spring timeframe that also existed on trunk. Many thanks to those that ran early versions of 0.22, filed issues and provided fixes. Trunk, and thus 0.23, are better for it.

Cheers,
Nige

On Sep 7, 2011, at 2:04 AM, Konstantin Shvachko wrote:

> Eric,
>
> It would take the same amount of resources to fix 0.22 as to merge
> append and security branches aka 0.20.205.
> Although I understand that Hortonworks needs to support its
> customer(s) and is eager to bridge the gap in functionality with its
> competitor(s), I think continuing with 0.20
> a-three-years-old-technology is not the best place to invest
> resources. In the past you advocated for 0.21 and 0.22, both now
> abandoned by your team(s) in favor of enhancing 0.20. It will be sad
> to see this backward/forward porting going on forever, diverging the
> Apache Hadoop project from natural evolutionary process.
>
> I think 0.22 has all the functionality required to run Hadoop for most
> production tasks. I see enough momentum and involvement in the
> community with 0.22 testing. I think there will be enough resources to
> get it stabilized in near future.
>
> Nigel,
>
> your comment can be understood as a request to commit important fixes
> to the 0.22 branch. I agree with that. But if you choose to abandon
> the RM role I will volunteer to take it over.
>
> Thanks,
> --Konstantin
>
> On Tue, Sep 6, 2011 at 4:06 AM, Eric Baldeschwieler
> <[EMAIL PROTECTED]> wrote:
>>
>> What do others think?
>>
>> IMO 22 is not the best place to invest resources. I support nigel's suggestion of abandoning it, but people are free to work on what they are passionate about.
>>
>> E14
>>
>> On Sep 3, 2011, at 11:11 AM, Nigel Daley wrote:
>>
>>> Matt, Others,
>>>
>>> Is the expectation that these fixes go into other release branches too (including 0.22, 0.23) if applicable?
>>>
>>> If not, my concern is that 0.22 is regressing further from the popular Apache release and I'm inclined to abandon 0.22. Thoughts?
>>>
>>> Cheers,
>>> Nige

On 12/09/11 05:52, Nigel Daley wrote:
>> Nigel,
>>
>> your comment can be understood as a request to commit important fixes
>> to the 0.22 branch. I agree with that. But if you choose to abandon
>> the RM role I will volunteer to take it over.
>>
>> Thanks,
>> --Konstantin
>
> [catching up on the week's email]
>
> Konstantin, you are free to carry on this 0.22 work and call a release vote if you wish.
>
> To me it is becoming clearer every week that 0.22 continues to regress in significant ways from 0.20.20x releases which will cause further confusion in our user base. Thus I will no longer manage a release from the 0.22 branch. When 0.20.200 was proposed, I thought this would be the only such release and regressions to 0.22 would be manageable. This is no longer the case now that there have been many 0.20.20x releases with significant changes and committers have not been merging these changes to intervening branches as was once common practice.
>
> At the current time, I believe the most value that we (the developer community) have derived from 0.22 was finding/fixing issues throughout the spring timeframe that also existed on trunk. Many thanks to those that ran early versions of 0.22, filed issues and provided fixes. Trunk, and thus 0.23, are better for it.

Well, if that's the case I'm going to start pushing some existing and new stuff into 0.20.x

> Many thanks to those that ran early versions of 0.22, filed issues and provided fixes.

This is one of the three reasons I think 0.22 release is important.

1. A lot of community work has been put into the branch. Developers were developing, committers were committing, and I heard many enthusiastic people participated in hackathons. It would be such a waste.
2. We do not have an Apache release supporting HBase. 0.20.205 is contemplated to be one. But it will not allow mixed workload. Same as with other append-based Hadoop versions people will have to use dedicated HBase clusters separate from "general purpose" clusters. 0.22 is a better choice in this respect.
3. Timing. With 0.23 being de facto a rewrite of Hadoop - both of MR and HDFS - its stabilization may take longer than anticipated. I believe 0.22 can be released fairly soon.

I was pleased with recent testing of the 0.22 branch. I plan to set up a 100-node cluster to test the build with other Hadoop components next week.

I will very much appreciate any help from the community as I don't have an army to marshal.

Thanks,
--Konstantin


> I benchmarked 22 v/s 20.xxx there was >5x difference and that was more than a year ago.

Arun, do you mind sharing the benchmarks that you ran?

Thanks,
--Konstantin

On Fri, Sep 9, 2011 at 11:26 AM, Arun C Murthy <[EMAIL PROTECTED]> wrote:> Joep,>> On Sep 9, 2011, at 10:41 AM, Rottinghuis, Joep wrote:>>> No one is going to block you from doing any work you want.>> All that is required is to have the work in trunk and subsequent branches i.e 0.23 (as applicable) .>> The problem is that 0.22 hasn't seen major movement for almost a year since it was branched and there is no incentive for lots of people to contribute - plus there is MRv2 which is completely different beast (see the discussion that Eli pointed to).>> None of this is to say you shouldn't contribute or use 0.22, you move the project as you wish by your contributions.>>>> MR1 is being maintained on 20x. In fact 20x is the only MR1 code that supports security and disk failure handling. The MR1 code in 22 is a regression in some significant aspects (features, performance, bugs) from the latest stable MR1 (204).>> ...>>>> Eli, aside from the disk failure handling which is a new feature in 205 (not present in earlier 20x releases), could you please elaborate on which other significant aspects 22 would regress from 20x?>>>> I've talked to Konstantin about this.>> There is tonnes of performance work missing, including scaling work on JobTracker, CapacityScheduler etc. There is work to add a ton of limits (counters, tasks, etc. etc.). Then there is operability work such as JobHistory, handling logs etc. The last time I benchmarked 22 v/s 20.xxx there was >5x difference and that was more than a year ago. Arguably some of the operability work won't matter for small clusters, but you are welcome to make your own decisions.>> It's unfortunate we have landed here, but 22 branched almost a year ago and hence none of this work was ported there since a branch implies critical bugs was supposed to go in. That plus the problems with scaling MR1 which led to investment in MR1 is where we are. 
As a result, there is no enthusiasm to contribute to MR1 from vast majority of devs given that we've decided we won't support it.>>>>> My understanding of the way Apache operates is that you can't do things like "declare blocks on upgrade paths". People can try to release updates to 21 or 22 (or some new tree). Ie the decisions are made implicitly by where people invest cycles.>>>> If a group of committers say that they'll commit to trunk, to 0.23 and to 0.20x, but not to 0.22, then in effect that is like to "declare a block on upgrade path" isn't it? The more such commits that go in to other branches, but not the ones in between essentially is a declaration of a block, because of the very regression argument you make.>>>> I agree that nobody can be made to contribute to something they don't want, but does that result into a split?>> In other words, if a significant bug fix or feature goes into trunk and into 22, can developers then simply say: "I'm not interested to put this into 23, you do this yourself if you want?". Will that be tolerated or vetoed?>>>> Again, no one is going to veto anything. As Eli said decisions are make by people's code.>> Arun>>
