As I look more at the impact of the common/MR/HDFS project split on what and how we release Hadoop, I feel like the split needs an adjustment. Many folks I've talked to agree that the project split has caused us a splitting headache. I think 1 relatively small change could alleviate some of that.

We're a long way from releasing these 3 projects independently. Given that, they should be branched and released as a unit. This SVN structure enforces that and provides a more natural place to keep a top level build and pkg scripts that operate across all 3 projects.

> Folks,
>
> As I look more at the impact of the common/MR/HDFS project split on what and how we release Hadoop, I feel like the split needs an adjustment. Many folks I've talked to agree that the project split has caused us a splitting headache. I think 1 relatively small change could alleviate some of that.
>
> CURRENT SVN REPO:
>
> hadoop / [common, mapreduce, hdfs] / trunk
> hadoop / [common, mapreduce, hdfs] / branches
>
> PROPOSAL:
>
> hadoop / trunk / [common, mapreduce, hdfs]
> hadoop / branches / [common, mapreduce, hdfs]
>
> We're a long way from releasing these 3 projects independently. Given that, they should be branched and released as a unit. This SVN structure enforces that and provides a more natural place to keep a top level build and pkg scripts that operate across all 3 projects.
>
> Thoughts?
>
> Cheers,
> Nige

> +1
>
> Death to the project split! Or short of that, anything to tame it.
>
> On Jan 13, 2011, at 10:18 PM, Nigel Daley wrote:
>
> > Folks,
> >
> > As I look more at the impact of the common/MR/HDFS project split on what
> > and how we release Hadoop, I feel like the split needs an adjustment. Many
> > folks I've talked to agree that the project split has caused us a splitting
> > headache. I think 1 relatively small change could alleviate some of that.
> >
> > CURRENT SVN REPO:
> >
> > hadoop / [common, mapreduce, hdfs] / trunk
> > hadoop / [common, mapreduce, hdfs] / branches
> >
> > PROPOSAL:
> >
> > hadoop / trunk / [common, mapreduce, hdfs]
> > hadoop / branches / [common, mapreduce, hdfs]
> >
> > We're a long way from releasing these 3 projects independently. Given
> > that, they should be branched and released as a unit. This SVN structure
> > enforces that and provides a more natural place to keep a top level build
> > and pkg scripts that operate across all 3 projects.
> >
> > Thoughts?
> >
> > Cheers,
> > Nige

--
Todd Lipcon
Software Engineer, Cloudera

On Fri, Jan 14, 2011 at 12:32 AM, Ian Holsman <[EMAIL PROTECTED]> wrote:
> +1 full agreement.
>
> I think it will be a pita admin wise (due to how svn authorization is set up), so it might slow down creation of a new branch, but it's worth it.
>
> ---
> Ian Holsman
> AOL Inc
> [EMAIL PROTECTED]
> (703) 879-3128 / AIM:ianholsman
>
> it's just a technicality
>
> On Jan 14, 2011, at 2:25 AM, Eric Baldeschwieler <[EMAIL PROTECTED]> wrote:
>
>> +1
>>
>> Death to the project split! Or short of that, anything to tame it.

on that note... I propose we discuss un-splitting the project altogether.

On Jan 14, 2011, at 3:39 AM, Jakob Homan wrote:

> +1. The project split is a lie.

> Folks,
>
> As I look more at the impact of the common/MR/HDFS project split on
> what and how we release Hadoop, I feel like the split needs an
> adjustment. Many folks I've talked to agree that the project split
> has caused us a splitting headache. I think 1 relatively small
> change could alleviate some of that.
>
> CURRENT SVN REPO:
>
> hadoop / [common, mapreduce, hdfs] / trunk
> hadoop / [common, mapreduce, hdfs] / branches
>
> PROPOSAL:
>
> hadoop / trunk / [common, mapreduce, hdfs]
> hadoop / branches / [common, mapreduce, hdfs]

Moving the source trees back together is ok, but will cause a fair amount of churn for those of us that depend on the git versions of the repository. Using Todd's hack may be able to fix it again at least for each individual user.

I assume you meant to propose:

hadoop/ {trunk, branches/*, tags/* } / {common, hdfs, mapreduce}

which means that you can make checkouts, branches and tags with a single command. Your proposal as stated would break all of the tools that count on standard layouts of subversion repositories, such as the subversion to git gateways and eclipse.
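To make the "single command" point concrete, here is a minimal sketch that builds the `svn copy` commands needed to cut a branch under each layout. The repository root URL, the branch name, and the function names are assumptions for illustration only; the commands are printed, not executed.

```python
# Hypothetical repository root; the real Apache URL is not given in this thread.
REPO = "https://svn.example.org/repos/hadoop"
PROJECTS = ["common", "hdfs", "mapreduce"]


def branch_commands_current(branch):
    """Current layout hadoop/<project>/{trunk,branches}: one copy per project."""
    return [
        f'svn copy {REPO}/{p}/trunk {REPO}/{p}/branches/{branch} -m "Branch {branch}"'
        for p in PROJECTS
    ]


def branch_commands_unified(branch):
    """Layout hadoop/{trunk,branches/*,tags/*}/<project>: one copy covers all three."""
    return [f'svn copy {REPO}/trunk {REPO}/branches/{branch} -m "Branch {branch}"']


if __name__ == "__main__":
    # "branch-X" is a placeholder branch name.
    for cmd in branch_commands_current("branch-X") + branch_commands_unified("branch-X"):
        print(cmd)
```

Under these assumptions the current layout needs three copies per branch or tag, while the unified layout needs one.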

We currently have other stuff at the top level of hadoop: hive, logos, nightly, pig, site, and zookeeper. Clearly hive, pig, and zookeeper should be removed. The others are just versioned and aren't branched. I'm fine with leaving them at the top level as "extra" bits, but it should be decided.

> Moving the source trees back together is ok, but will cause a fair amount
> of churn for those of us that depend on the git versions of the repository.
> Using Todd's hack may be able to fix it again at least for each individual
> user.

Yep, I think we can set this up in a reasonable way with grafts. I'm happy to write a little shell script we can put on the wiki for git users that would make the history look sane.

Depending on how the git mirrors work, we might even be able to get the official git.apache.org one to work, too. If we decide to go forward with this plan I'll talk to the relevant INFRA folks and see about that.

I'm a huge supporter of the idea. On a related note, we've been looking for the right time to mavenize. Maybe we can do both together. We could pitch in a bunch of work on both if we could get the timing right.

We've got a huge batch of commits in flight now, but if we can find something that satisfied the 22 crowd after we sync in, we'd be happy to pitch in on unsplit and/or maven.

---
E14 - via iPhone

On Jan 14, 2011, at 5:43 AM, "Ian Holsman" <[EMAIL PROTECTED]> wrote:

> on that note... I propose we discuss un-splitting the project altogether.

The current layout means that every svn operation made during a release has to be carried out three times which increases the amount of work and the chances of a mistake.

Cheers
Tom

On Fri, Jan 14, 2011 at 9:17 AM, Todd Lipcon <[EMAIL PROTECTED]> wrote:
> On Fri, Jan 14, 2011 at 8:51 AM, Owen O'Malley <[EMAIL PROTECTED]> wrote:
>
>> Moving the source trees back together is ok, but will cause a fair amount
>> of churn for those of us that depend on the git versions of the repository.
>> Using Todd's hack may be able to fix it again at least for each individual
>> user.
>
> Yep, I think we can set this up in a reasonable way with grafts. I'm happy
> to write a little shell script we can put on the wiki for git users that
> would make the history look sane.
>
> Depending on how the git mirrors work, we might even be able to get the
> official git.apache.org one to work, too. If we decide to go forward with
> this plan I'll talk to the relevant INFRA folks and see about that.
>
> -Todd
> --
> Todd Lipcon
> Software Engineer, Cloudera

On Fri, Jan 14, 2011 at 09:32, Eric Baldeschwieler <[EMAIL PROTECTED]> wrote:
> I'm a huge supporter of the idea. On a related note, we've been looking for the right time to mavenize. Maybe we can do both together. We could pitch in a bunch of work on both if we could get the timing right.

Adding mavenization, a great effort by itself with a lot of potential pitfalls, into the splitting frenzy has a good chance of being pretty disruptive for individual developers. I well remember how people were screaming because virtually nothing worked in the post-split time and they had to invent and learn new tricks merely to do their work.

Being a great tool, Maven has a relatively steep learning curve, which won't ease the fact that the project layout has changed again and everyone will have to adjust their workbenches to address that fact.

I suggest these two events be well separated in time.

Cos

> which means that you can make checkouts, branches and tags with a single command. Your proposal as stated would break all of the tools that count on standard layouts of subversion repositories, such as the subversion to git gateways and eclipse.

I'm not much of a git user.
1) subversion to git gateway: break unrepairably?
2) eclipse: break in what way? I use eclipse for other projects structured like this without a problem.

> We currently have other stuff at the top level of hadoop: hive, logos, nightly, pig, site, and zookeeper. Clearly hive, pig, and zookeeper should be removed. The others are just versioned and aren't branched. I'm fine with leaving them at the top level as "extra" bits, but it should be decided.

+1 to removing/leaving those extra bits as you propose. We can always move the remainder later if there is a reason to.

Thanks for the offer Eric! I agree it's the right time to mavenize, but I think we should separate, but order, these two discussions/events. This first, then mavenization.

Cheers,
Nige

On Jan 14, 2011, at 9:32 AM, Eric Baldeschwieler wrote:

> I'm a huge supporter of the idea. On a related note, we've been looking for the right time to mavenize. Maybe we can do both together. We could pitch in a bunch of work on both if we could get the timing right.
>
> We've got a huge batch of commits in flight now, but if we can find something that satisfied the 22 crowd after we sync in, we'd be happy to pitch in on unsplit and/or maven.
>
> ---
> E14 - via iPhone

> We actually still haven't recovered from the projects split.
> We are still fixing HDFS and MR scripts with several jiras open.

Great, so let's do this reorg before we fix those jiras so we don't need to fix them again. Can you provide the issue numbers you're thinking of?

Thx,
Nige

> As I look more at the impact of the common/MR/HDFS project split on what> and how we release Hadoop, I feel like the split needs an adjustment. Many> folks I've talked to agree that the project split has caused us a splitting> headache. I think 1 relatively small change could alleviate some of that.

Could you elaborate on how the proposed changes would help? What problems are being addressed? It is not clear to me.

You are right that the change is small but the impact is huge. We should first understand what we are getting from the changes before doing it.

> Hi Nigel,
>
>> As I look more at the impact of the common/MR/HDFS project split on what
>> and how we release Hadoop, I feel like the split needs an adjustment. Many
>> folks I've talked to agree that the project split has caused us a splitting
>> headache. I think 1 relatively small change could alleviate some of that.
>
> Could you elaborate your idea on how the proposed changes would help? What the
> problems are being addressed? It is not clear to me.

Critical in my mind was my statement: "We're a long way from releasing these 3 projects independently. Given that, they should be branched and released as a unit." This cannot be enforced given the current svn layout. Others can weigh in with additional thoughts.

> You are right that the change is small but the impact is huge. We should first
> understand what we are getting from the changes before doing it.

What do you see as the huge impact?

Nige

Well this will generate tens of more jiras, which won't justify closing the few remaining.
You have been there, it took months to get that thing settled so it was usable.
I am just saying it's a risk we can get the same this time.
--Konstantin

This is a kind of incompatible change: all the developers, QAs, release engineers and users have to change their local settings and scripts for this change. Moreover, there is documentation, and there are web pages and existing tools, using the Apache svn URLs. So it is a huge impact. I am conservative on this since, as Konstantin mentioned, we risk getting into the same mess, and it will create more work for the community.
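To give a sense of what updating those scripts and URLs might involve, here is a minimal sketch of a rewrite helper that maps a URL in the current layout to the layout with trunk, branches and tags at the top. The example URLs and the function name are assumptions; paths outside the three projects (such as site or nightly) are returned unchanged.

```python
import re

# Matches <root>/hadoop/<project>/<trunk | branches/NAME | tags/NAME>[/rest]
_OLD_LAYOUT = re.compile(
    r"^(?P<root>.*/hadoop)/(?P<project>common|hdfs|mapreduce)/"
    r"(?P<line>trunk|(?:branches|tags)/[^/]+)(?P<rest>/.*)?$"
)


def to_new_layout(url):
    """Move the project name below trunk, branches/NAME or tags/NAME."""
    match = _OLD_LAYOUT.match(url)
    if match is None:
        return url  # not one of the three project trees: leave untouched
    rest = match.group("rest") or ""
    return f"{match['root']}/{match['line']}/{match['project']}{rest}"


if __name__ == "__main__":
    # Hypothetical URLs; the real repository root is not given in this thread.
    print(to_new_layout("https://svn.example.org/repos/hadoop/hdfs/trunk/src"))
    print(to_new_layout("https://svn.example.org/repos/hadoop/common/branches/branch-X"))
```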

Why do we want to enforce the releases as a unit, given that the long term target is to release these 3 projects independently?

On the flipside, right now it's only the developers, QAs and release engineers. There hasn't been much movement to 0.21 yet, and if we're agreed on the change in general, then pushing out 0.22 without it means making users change everything twice.

> This is a kind of an incompatible change: all the developers, QAs, release
> engineers and users have to change their local settings and scripts for this
> change. Moreover, there are documentations, web pages and existing tools using
> the Apache svn URLs. So it is a huge impact. I am conservative on this since,
> as Konstantin mentioned, we risk to get into the same mess, and it will create
> more work for the community.
>
> Why do we want to enforce the releases as a unit, given that the long term
> target is to release these 3 projects independently?
>
> Nicholas
>
> ________________________________
> From: Nigel Daley <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Sent: Fri, January 14, 2011 11:21:25 AM
> Subject: Re: [DISCUSS] Move project split down a level
>
> On Jan 14, 2011, at 11:16 AM, Tsz Wo (Nicholas), Sze wrote:
>
> > Hi Nigel,
> >
> >> As I look more at the impact of the common/MR/HDFS project split on what
> >> and how we release Hadoop, I feel like the split needs an adjustment. Many
> >> folks I've talked to agree that the project split has caused us a splitting
> >> headache. I think 1 relatively small change could alleviate some of that.
> >
> > Could you elaborate your idea on how the proposed changes would help? What the
> > problems are being addressed? It is not clear to me.
>
> Critical in my mind was my statement: "We're a long way from releasing these 3
> projects independently. Given that, they should be branched and released as a
> unit." This can not be enforced given the current svn layout. Other's can weigh
> in with additional thoughts.
>
> > You are right that the change is small but the impact is huge. We should first
> > understand what we are getting from the changes before doing it.
>
> What do you see as the huge impact?
>
> Nige

However, I would like to see a detailed proposal on how this will be done, and discussion of it, before moving forward. If this work is done, we need clear messages to the developers on what has changed and how the development process is affected. These details were missing when the project split was done, causing a great deal of confusion and pain.

We should also address the following:
# Is project split a goal for hadoop in the future (even though we are not ready yet)? If it is, then putting projects back together might result in tight dependencies between the projects. How do we avoid it?
# The committer list for each of the sub projects today is different. How do we reconcile them?

On 1/14/11 11:53 AM, "Tsz Wo (Nicholas), Sze" <[EMAIL PROTECTED]> wrote:

This is a kind of an incompatible change: all the developers, QAs, release engineers and users have to change their local settings and scripts for this change. Moreover, there are documentations, web pages and existing tools using the Apache svn URLs. So it is a huge impact. I am conservative on this since, as Konstantin mentioned, we risk to get into the same mess, and it will create more work for the community.

Why do we want to enforce the releases as a unit, given that the long term target is to release these 3 projects independently?

> Hi Nigel,>>> As I look more at the impact of the common/MR/HDFS project split on what>> and how we release Hadoop, I feel like the split needs an adjustment. Many>> folks I've talked to agree that the project split has caused us a splitting>> headache. I think 1 relatively small change could alleviate some of that.>> Could you elaborate your idea on how the proposed changes would help? What the>> problems are being addressed? It is not clear to me.

Critical in my mind was my statement: "We're a long way from releasing these 3projects independently. Given that, they should be branched and released as aunit." This can not be enforced given the current svn layout. Other's can weighin with additional thoughts.

> You are right that the change is small but the impact is huge. We should first>> understand what we are getting from the changes before doing it.

Just to be clear, the proposal currently being discussed is NOT a full undo of the split -- it might be better described as a tweak or a bug fix to the (on-going) project split. If someone would like to start a discussion on a complete undo of the project split, please do so under a different thread.

Good questions. Keep them coming! I'll compile a list so we can start an FAQ on this.

> # Is project split a goal for hadoop in the future (even though we are not ready yet?). If it is, then putting projects back together might result in tight dependencies between the project. How do we avoid it?

Note, we're not putting them back together. This is NOT a cmd-Z (ctrl-Z) on the project split. It's putting them back under one trunk, but as separate projects underneath that. IMO this is a relatively small change in the universe of undo-the-project-split possibilities.

> # The committer list for each of the sub project today is different. How do we reconcile them?

I'd like to keep that issue out of this change if at all possible. I recommend for now we keep the status quo. Thus, even though all committers may technically have permission to commit to all 3 project trees (can someone confirm that?), we would need to rely on the honor system that committers will only commit to their project trees.

Cheers,
Nige


> This is a kind of an incompatible change: all the developers, QAs, release > engineers and users have to change their local settings and scripts for this > change.

I have a hard time believing this as I suspect the very small set of folks that test/deploy post-project split releases (0.21/0.22/trunk) have been smashing the 3 projects back together for test/deploy purposes on a cluster. You *will* have to change your personal and your build machine SVN checkout urls, but beyond that, the projects remain as-is in separate trees. When we mavenize, that will perhaps cause the disruption you're mentioning, but that is a separate issue/discussion from this one.

> Moreover, there are documentations, web pages and existing tools using > the Apache svn URLs.

I sign up to correct the wiki, site, and Apache Hudson builds and build scripts (although help is gratefully received).

> So it is a huge impact. I am conservative on this since, > as Konstantin mentioned, we risk to get into the same mess, and it will create > more work for the community.

I don't believe we have exited from the previous mess. This could actually help us do that faster. You may not have noticed that 0.21 was released as a single smashed together tar ball. 0.22 IMO is heading for the same kind of release.

> Why do we want to enforce the releases as a unit, given that the long term > target is to release these 3 projects independently?

Because that long term view is currently a fantasy with no real end in sight.

Nige

On Mon, Jan 17, 2011 at 9:04 PM, Nigel Daley <[EMAIL PROTECTED]> wrote:
> Good questions. Keep them coming! I'll compile a list so we can start an FAQ on this.
>
>> # Is project split a goal for hadoop in the future (even though we are not ready yet?). If it is, then putting projects back together might result in tight dependencies between the project. How do we avoid it?
>
> Note, we're not putting them back together. This is NOT a cmd-Z (ctrl-Z) on the project split. It's putting them back under one trunk, but as separate projects underneath that. IMO this is a relatively small change in the universe of undo-the-project-split possibilities.
>
>> # The committer list for each of the sub project today is different. How do we reconcile them?
>
> I'd like to keep that issue out of this change if at all possible. I recommend for now we keep the status quo. Thus, even though all committers may technically have permission to commit to all 3 project trees (can someone confirm that?), we would need to rely on the honor system that committers will only commit to their project trees.

That's right - we don't have fine-grained commit access on the Hadoop tree, so we don't need to change anything here.

Tom


> On Jan 14, 2011, at 11:53 AM, Tsz Wo (Nicholas), Sze wrote:
>> ...
>> Why do we want to enforce the releases as a unit, given that the long term
>> target is to release these 3 projects independently?
>
> Because that long term view is currently a fantasy with no real end in sight.

** +1 to that. We release as a unit, branch as a unit, test as a unit, deploy as a unit. I've seen no actual gain from the project split, just complexity. Simplifying things in the short term seems like a good goal.


On Mon, Jan 17, 2011 at 21:40, Eric Baldeschwieler <[EMAIL PROTECTED]> wrote:
> On Jan 17, 2011, at 9:13 PM, Nigel Daley wrote:
>>> On Jan 14, 2011, at 11:53 AM, Tsz Wo (Nicholas), Sze wrote:
>>> ...
>>> Why do we want to enforce the releases as a unit, given that the long term
>>> target is to release these 3 projects independently?
>>
>> Because that long term view is currently a fantasy with no real end in sight.
>
> ** +1 to that. We release as a unit, branch as a unit, test as a unit, deploy as a unit. I've seen no actual gain from the project split, just complexity.

Am I missing something in the latest development of Hadoop, Eric? What do you mean by 'we... test as a unit'? Is it that we have test artifacts versioned against the Hadoop release proper? Or are you trying to say something else? It isn't very clear, sorry...

Cos

Todd, left a note there for you to add in a link to a git-history-fixer-script Jira when the time comes. If folks find more docs that will need updating, please add them to the list at the end of the FAQ.

We still need to fill in the exact steps that developers will need to perform to change their workspaces once this change is made.

Nige


Nigel proposes that in this release (as in previous releases), everything should be packaged together.

Our in-house experience at Yahoo is that this makes a lot of sense. It is how we find it most effective to operate. The project split has introduced a lot of complexity with no return.

Do you see any advantage to the status quo, versus Nigel's proposal?

Thanks! And sorry for any ambiguity.

E14


Packaging everything together makes sense for unit/functional level tests. Since Hadoop in its current shape doesn't have any other kinds of tests (apart from the limited system tests implemented within the Herriot framework), there are no objections per se.

And you know my take on Hadoop's stack testing, which proved to be beneficial for the Y! security release stabilization in particular: stack testing has to be done by a set of separate component-specific testing artifacts tailored together for the particular stack's validation.

But since we aren't anywhere near that noble goal (as correctly mentioned by Nigel et al.), we don't need to make any changes in our "bits + functional tests" packaging schema. Thanks for the clarification of your point, btw.

Cos


>We actually still haven't recovered from the projects split.
>We are still fixing HDFS and MR scripts with several jiras open.
>
>If we start this re-split now again before the major release
>we risk to get into the same mess, and it will create more work
>for the community.
>
>I see Nigel's point that packaging will get easier, and developers
>will push less buttons when they commit.
>It will delay the release - this is what worries me.
>
>Thanks,
>--Konstantin

I don't see how it would change any of those scripts. All three projects would still be isolated with the same internal path structures; their paths relative to each other in svn would be the only difference.

For developers, this might mean simply using 'svn switch' once and nothing more.
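
To make the size of that change concrete, here is a minimal Python sketch of the URL rewrite implied by the proposal (hadoop / [project] / trunk becoming hadoop / trunk / [project]). It is an illustration only: the host `svn.example.org`, the function names, and the decision to handle trunk alone are assumptions, not anything agreed in this thread. Branches are left out on purpose because the thread does not spell out where the branch name would sit in the new layout.

```python
from urllib.parse import urlsplit

# Project names taken from the thread; everything else here is hypothetical.
PROJECTS = ("common", "hdfs", "mapreduce")


def to_proposed_layout(url: str) -> str:
    """Rewrite .../hadoop/<project>/trunk/... to .../hadoop/trunk/<project>/..."""
    parts = urlsplit(url)
    segments = parts.path.split("/")
    for i in range(len(segments) - 1):
        if segments[i] in PROJECTS and segments[i + 1] == "trunk":
            # Swap the project segment and the trunk segment.
            segments[i], segments[i + 1] = segments[i + 1], segments[i]
            return parts._replace(path="/".join(segments)).geturl()
    raise ValueError(f"no <project>/trunk pair found in {url!r}")


def switch_command(old_url: str) -> str:
    """Build the one-off 'svn switch' command a developer would run in a working copy."""
    return f"svn switch {to_proposed_layout(old_url)}"


if __name__ == "__main__":
    # Hypothetical repository root, used only for illustration.
    for project in PROJECTS:
        print(switch_command(f"https://svn.example.org/repos/hadoop/{project}/trunk"))
```

The sketch only builds the command string; it does not touch a working copy. Paths inside each project stay the same, which is why scripts that use project-relative paths should not need to change.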
