As someone who has been part of this effort from inception, I am glad that we have reached this stable state in the project on both branches of Hadoop. It has been a great collaboration across teams and engineers and opens up Hadoop to a whole new set of deployments and developers!

I had posted heads up about merging branch-trunk-win to trunk on Feb 8th. I am happy to announce that we are ready for the merge.

Here is a brief recap on the highlights of the work done:
- Command-line scripts for the Hadoop surface area
- Mapping the HDFS permissions model to Windows
- Abstracted and reconciled mismatches around differences in Path semantics in Java and Windows
- Native Task Controller for Windows
- Implementation of a Block Placement Policy to support cloud environments, more specifically Azure.
- Implementation of Hadoop native libraries for Windows (compression codecs, native I/O)
- Several reliability issues, including race-conditions, intermittent test failures, resource leaks.
- Several new unit test cases written for the above changes

> It is super exciting to look at the prospect of these changes being
> merged to trunk. Having Windows as one of the supported Hadoop
> platforms is a fantastic opportunity both for the Hadoop project and
> Microsoft customers.
>
> This work began around a year back when a few of us started with a
> basic port of Hadoop on Windows. Ever since, the Hadoop team in
> Microsoft have made significant progress in the following areas:
> (PS: Some of these items are already included in Suresh's email, but
> including again for completeness)
>
> - Command-line scripts for the Hadoop surface area
> - Mapping the HDFS permissions model to Windows
> - Abstracted and reconciled mismatches around differences in Path
>   semantics in Java and Windows
> - Native Task Controller for Windows
> - Implementation of a Block Placement Policy to support cloud
>   environments, more specifically Azure.
> - Implementation of Hadoop native libraries for Windows (compression
>   codecs, native I/O)
> - Several reliability issues, including race-conditions, intermittent
>   test failures, resource leaks.
> - Several new unit test cases written for the above changes
>
> In the process, we have closely engaged with the Apache open source
> community and have got great support and assistance from the community
> in terms of contributing fixes, code review comments and commits.

I've been testing and patching this branch for the past several months, and I believe we've reached stability for a merge to trunk. I want to point out once again that the branch has been tested on Linux to build confidence that regressions were not introduced on existing platforms. I also want to second the comments from Bikas about how wonderful the collaboration has been!

> +1
>
> As someone who has been part of this effort from inception, I am glad that
> we have reached this stable state in the project on both branches of
> Hadoop.
> It has been a great collaboration across teams and engineers and opens up
> Hadoop to a whole new set of deployments and developers!
>
> Bikas
>
> -----Original Message-----
> From: Suresh Srinivas [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, February 26, 2013 2:56 PM
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
> Subject: [Vote] Merge branch-trunk-win to trunk
>
> I had posted heads up about merging branch-trunk-win to trunk on Feb 8th.
> I am happy to announce that we are ready for the merge.
>
> Here is a brief recap on the highlights of the work done:
> - Command-line scripts for the Hadoop surface area
> - Mapping the HDFS permissions model to Windows
> - Abstracted and reconciled mismatches around differences in Path
>   semantics in Java and Windows
> - Native Task Controller for Windows
> - Implementation of a Block Placement Policy to support cloud
>   environments, more specifically Azure.
> - Implementation of Hadoop native libraries for Windows (compression
>   codecs, native I/O)
> - Several reliability issues, including race-conditions, intermittent test
>   failures, resource leaks.
> - Several new unit test cases written for the above changes
>
> Please find the details of the work in CHANGES.branch-trunk-win.txt -
> Common changes <http://bit.ly/Xe7Ynv>, HDFS changes <http://bit.ly/13QOSo9>,
> and YARN and MapReduce changes <http://bit.ly/128zzMt>. This is the work
> ported from branch-1-win to a branch based on trunk.
>
> For details of the testing done, please see the thread -
> http://bit.ly/WpavJ4. Merge patch for this is available on HADOOP-8562
> <https://issues.apache.org/jira/browse/HADOOP-8562>.
>
> This was a large undertaking that involved developing code, testing the
> entire Hadoop stack, including scale tests. This is made possible only
> with the contribution from many many folks in the community. Following
> people contributed to this work: Ivan Mitic, Chuan Liu, Ramya Sunil, Bikas
> Saha, Kanna Karanam, John Gordon, Brandon Li, Chris Nauroth, David Lao,
> Sumadhur Reddy Bolli, Arpit Agarwal, Ahmed El Baz, Mike Liddell, Jing
> Zhao, Thejas Nair, Steve Maine, Ganeshan Iyer, Raja Aluri, Giridharan
> Kesavan, Ramya Bharathi Nimmagadda, Daryn Sharp, Arun Murthy, Tsz-Wo
> Nicholas Sze, Suresh Srinivas and Sanjay Radia. There are many others who
> contributed as well providing feedback and comments on numerous jiras.
>
> The vote will run for seven days and will end on March 5, 6:00PM PST.
>
> Regards,
> Suresh
>
> On Thu, Feb 7, 2013 at 6:41 PM, Mahadevan Venkatraman
> <[EMAIL PROTECTED]> wrote:
>
> > It is super exciting to look at the prospect of these changes being
> > merged to trunk. Having Windows as one of the supported Hadoop
> > platforms is a fantastic opportunity both for the Hadoop project and
> > Microsoft customers.
> >
> > This work began around a year back when a few of us started with a
> > basic port of Hadoop on Windows. Ever since, the Hadoop team in
> > Microsoft have made significant progress in the following areas:
> > (PS: Some of these items are already included in Suresh's email, but
> > including again for completeness)
> >
> > - Command-line scripts for the Hadoop surface area
> > - Mapping the HDFS permissions model to Windows
> > - Abstracted and reconciled mismatches around differences in Path

> On Thu, Feb 7, 2013 at 6:41 PM, Mahadevan Venkatraman
> <[EMAIL PROTECTED]> wrote:
>
> > It is super exciting to look at the prospect of these changes being
> > merged to trunk. Having Windows as one of the supported Hadoop platforms
> > is a fantastic opportunity both for the Hadoop project and Microsoft
> > customers.
> >
> > This work began around a year back when a few of us started with a basic
> > port of Hadoop on Windows. Ever since, the Hadoop team in Microsoft have
> > made significant progress in the following areas:
> > (PS: Some of these items are already included in Suresh's email, but
> > including again for completeness)
> >
> > - Command-line scripts for the Hadoop surface area
> > - Mapping the HDFS permissions model to Windows
> > - Abstracted and reconciled mismatches around differences in Path
> >   semantics in Java and Windows
> > - Native Task Controller for Windows
> > - Implementation of a Block Placement Policy to support cloud
> >   environments, more specifically Azure.
> > - Implementation of Hadoop native libraries for Windows (compression
> >   codecs, native I/O)
> > - Several reliability issues, including race-conditions, intermittent
> >   test failures, resource leaks.
> > - Several new unit test cases written for the above changes
> >
> > In the process, we have closely engaged with the Apache open source
> > community and have got great support and assistance from the community in
> > terms of contributing fixes, code review comments and commits.
> >
> > In addition, the Hadoop team at Microsoft has also made good progress in
> > other projects including Hive, Pig, Sqoop, Oozie, HCat and HBase. Many of
> > these changes have already been committed to the respective trunks with

As someone who also contributed to porting Hadoop to Windows, I think Java already provides a very good platform-independent foundation. For features that are not available in Java, we will try to provide our own platform-independent APIs that abstract OS tasks away. Most features should have no difficulty running on Windows and Linux by using Java and those platform-independent APIs.

For the concerns raised about new features that may fail on Windows, I think we don't need to make passing on Windows a mandate at the moment. We can simply mark such a feature as unavailable on Windows and port it later if the feature is important.
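One way to picture "mark it unavailable on Windows" is a small guard that a feature calls before it does any platform-specific work. The sketch below is only an illustration under stated assumptions: the class name `PlatformGuardSketch`, the method names, and the feature name passed in are all hypothetical and are not part of the merge patch. The OS check mirrors the common approach of inspecting the `os.name` system property.

```java
import java.util.Locale;

/** Hypothetical sketch: reject a feature on Windows until it has been ported. */
public class PlatformGuardSketch {

    /** Returns true when the given os.name value denotes a Windows system. */
    public static boolean isWindows(String osName) {
        return osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    /** Throws if the named feature is requested on a platform it does not support yet. */
    public static void requireSupported(String feature, String osName) {
        if (isWindows(osName)) {
            throw new UnsupportedOperationException(
                feature + " is not yet available on Windows");
        }
    }

    public static void main(String[] args) {
        // In real code the OS name would come from the running JVM.
        String osName = System.getProperty("os.name");
        try {
            requireSupported("exampleFeature", osName);
            System.out.println("exampleFeature enabled on " + osName);
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Taking the OS name as a parameter keeps the guard testable on any machine; a caller on Linux can still exercise the Windows branch.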

> Is there a jira for resolving the outstanding TODOs in the code base
> (similar to HDFS-2148)? Looks like this merge doesn't introduce many
> which is great (just did a quick diff and grep).

I found 2 remaining TODOs introduced in the current merge patch. One is in ContainerLaunch.java. The container launch script was trying to set a CLASSPATH that exceeded the Windows maximum command line length. The fix was to wrap the long classpath into an intermediate jar containing only a manifest file with a Class-Path entry. (See YARN-316.) Just to be conservative, we wrapped this logic in an if (Shell.WINDOWS) guard and marked a TODO to remove it later and use that approach on all platforms after additional testing. I've tested this code path successfully on Mac too, but several people wanted additional testing and performance checks before removing the if (Shell.WINDOWS) guard. That work is tracked in an existing jira: YARN-358.
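For readers unfamiliar with the manifest-only jar technique mentioned above, here is a minimal sketch using only the JDK's `java.util.jar` classes. It is not the code from YARN-316; the class name `ClasspathJarSketch`, the method name, and the file names in `main` are assumptions made for illustration. The idea is that the long classpath is written as space-separated URLs into the `Class-Path` attribute of a jar's manifest, so the command line only needs to name that one jar.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

/** Hypothetical sketch: wrap a long classpath in a jar that holds only a manifest. */
public class ClasspathJarSketch {

    /** Writes jarFile with a manifest whose Class-Path lists every entry as a URL. */
    public static File createManifestOnlyJar(List<File> entries, File jarFile)
            throws IOException {
        StringBuilder classPath = new StringBuilder();
        for (File entry : entries) {
            if (classPath.length() > 0) {
                classPath.append(' ');  // Class-Path entries are space-separated
            }
            // toURI() escapes spaces and adds a trailing '/' for existing directories.
            classPath.append(entry.toURI().toString());
        }

        Manifest manifest = new Manifest();
        Attributes attrs = manifest.getMainAttributes();
        // Manifest-Version must be set or the main attributes are not written.
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attrs.put(Attributes.Name.CLASS_PATH, classPath.toString());

        // No class files are added; the manifest is the jar's only content.
        try (JarOutputStream out =
                 new JarOutputStream(new FileOutputStream(jarFile), manifest)) {
            out.flush();
        }
        return jarFile;
    }

    public static void main(String[] args) throws IOException {
        File jar = File.createTempFile("classpath", ".jar");
        createManifestOnlyJar(
            Arrays.asList(new File("lib/a.jar"), new File("lib/b.jar")), jar);
        System.out.println("java -cp " + jar.getAbsolutePath() + " <main class>");
    }
}
```

A launcher would then pass the single generated jar to `-cp` in place of the full list, which keeps the command line short regardless of how many entries the classpath has.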

The other TODO is for winutils to print more usage information and examples. At this point, I think winutils is printing sufficient information, and we can just remove the TODO. I just submitted a new jira to start that conversation: HADOOP-9348.

> My initial question was mostly intended to understand the desired new
> classification of Windows after the merge, and how we plan to maintain
> Windows support. I am happy to hear that hardware for Jenkins will be
> provided. I am also fine, at least initially, with us trying to treat
> Windows as a first class supported platform. But I realize that there
> are a lot of people that do not have easy access to Windows for
> development/debugging, myself included. I also don't want to slow down
> the pace of development too much because of this. It will cause some
> organizations that do not use or support Windows to be more likely to
> run software that has diverged from an official release. It also has
> the potential to make the patch submission process even more
> difficult, which increases the likelihood of submitters abandoning
> patches. However, the great thing about being in a community is we can
> change if we need to.
>
> I am +0 for the merge. I am not a Windows expert so I don't feel
> comfortable giving it a true +1.
>
> --Bobby
>
> On 2/28/13 10:45 AM, "Chris Nauroth" <[EMAIL PROTECTED]> wrote:
>
> >I'd like to share a few anecdotes about developing cross-platform,
> >hopefully to address some of the concerns about adding overhead to
> >the development process. By reviewing past cases of cross-platform
> >Linux vs. Windows bugs, we can get a sense for how the development
> >process could look in the future.
> >
> >HADOOP-9131: TestLocalFileSystem#testListStatusWithColons cannot run
> >on Windows. As part of an earlier jira, HADOOP-8962, there was a new
> >test committed on trunk covering the case of a local file system
> >interaction on a file containing a ':'. On Windows, ':' in a path

- Windows has actually been a supported platform for Hadoop since 0.1. Doug championed supporting Windows then and we've continued to do it with varying vigor over time. To my knowledge we've never made a decision to drop Windows support. The change here is improving our support and dropping the requirement of Cygwin. We had Nutch Windows users on the list in 2006 and we've been supporting Windows FS requirements since inception.

- A little pragmatism will go a long way. As a community we've got to stay committed to keeping hadoop simple (so it does work on many platforms) and extending it to take advantage of key emerging OS/hardware features, such as containers, new FSs, virtualization, flash ... We should all plan to let new features & optimizations emerge that don't work everywhere, if they are compelling and central to hadoop's mission of being THE best fabric for storing and processing big data.

- A UI project like KDE has to deal with the MANY differences between Windows and Linux UI APIs. Hadoop faces no such complex challenge and hence can be maintained from a single codeline IMO. It is mostly abstracted from the OS APIs via Java and our design choices. Where it is not, we can continue to add pluggable abstractions.

On Feb 28, 2013, at 6:01 PM, Matt Foley <[EMAIL PROTECTED]> wrote:

> +1 (binding)
>
> Apache is supposed to be about the community. We have here a community of
> developers, who have actively and openly worked to add a major improvement
> to Hadoop: the ability to work cross-platform. Furthermore, the size of
> the substantive part of the needed patch is only about 1500 lines, much
> smaller than quite a few other additions to Hadoop over the last few
> months. We should welcome and support this change, and make sure that the
> code stays cross-platform going forward by extending our CI practices,
> especially pre-commit "test-patch", to also include Windows.
>
> As most of you know, my colleague Giri Kesavan (PMC member) helps maintain
> the Linux CI capability for Hadoop. I've talked with him, and he and I are
> committing to getting test-patch implemented for Windows, so that along
> with the current automated "+1"s required to commit, we can add two more,
> for javac build in Windows and core unit tests in Windows.
>
> Members of the team implementing cross-platform compatibility, including
> Microsoft employees, have opened the discussion for providing hardware or
> VM resources to perform this additional CI testing. I will assist them to
> work with the Apache Infra team and figure out how to make it happen.
>
> I understand there is some concern about the additional platform test.
> My going-in presumption, based on Java's intrinsic, pretty-good,
> cross-platform compatibility, is that patches to Hadoop will by default
> also have cross-platform compatibility, unless they are written in an
> explicitly platform-dependent way. I also believe that in the vast
> majority of cases the cross-platform compatibility of Java will carry thru
> to Hadoop patches, without additional effort on the developer's part.
>
> Let's try it, and see what happens. If we actually find a frequent
> difficulty, we'll change to engineer around it. But I believe that, in the
> rare cases where a Windows-specific failure occurs, there will be a number
> of people (new, enthusiastic members of the community! :-) willing to help.
> If such help is not forthcoming, then we can discuss work-arounds, but
> like a previous poster, I am confident in the community.
>
> Regards,
> --Matt
>
> On Thu, Feb 28, 2013 at 12:21 PM, Chuan Liu <[EMAIL PROTECTED]> wrote:
>
>> +1 (non-binding)
>>
>>> As someone also contributed to porting Hadoop to Windows, I think Java
>>> already provided a very good platform independent platform.
>>> For features that are not available in Java, we will try to provide our
>>> platform independent APIs that abstract OS tasks away.

Have we agreed (and stated it somewhere proper) that a -1 obtained for a Windows CI build for a test-patch will not block the ongoing work (unless it is Windows specific) and patches may still be committed to trunk despite that?

I'm +1 if someone can assert and add the above into the formal guidelines. I'd still prefer that Windows does its releases separately as that ensures more quality for its audience and better testing periods (and wouldn't block anything), but we can come to that iff we are unable to maintain the currently proposed model.

> Have we agreed (and stated it somewhere proper) that a -1 obtained for
> a Windows CI build for a test-patch will not block the ongoing work
> (unless it is Windows specific) and patches may still be committed to
> trunk despite that?

This thread is long and possibly hard to follow. Yes, I and several others have stated that for now it is okay to commit even if Windows precommit build posts -1.

> I'm +1 if someone can assert and add the above into the formal
> guidelines. I'd still prefer that Windows does its releases separately
> as that ensures more quality for its audience and better testing
> periods (and wouldn't block anything), but we can come to that iff we
> are unable to maintain the currently proposed model.

Which do you think is the right place to add this?

At this time we are voting for merging into trunk. I prefer having a single release that supports both Linux and Windows. Based on working on Windows support I think this is doable and should not hold up releases for Linux.

Konstantin, you have voted -1, and stated some requirements before you'll withdraw that -1. As I plan to do work to fulfill those requirements, I want to make sure that what I'm proposing will, in fact, satisfy you. That's why I'm asking: if we implement full "test-patch" integration for Windows, does it seem to you that that would provide adequate support?

I have learned not to presume that my interpretation is correct. My interpretation of item #1 is that test-patch provides pre-commit build, so it would satisfy item #1. But rather than assuming that I am interpreting it correctly, I simply want your agreement that it would, or if not, clarification why it won't.

Regarding item #2, it is also my interpretation that test-patch provides an on-demand (perhaps 20-minutes deferred) Jenkins build and unit test, with logs available to the developer, so it would satisfy item #2. But rather than assuming that I am interpreting it correctly, I simply want your agreement that it would, or if not, clarification why it won't.

In agile terms, you are the Owner of these requirements. Please give me owner feedback as to whether my proposed work sounds like it will satisfy the requirements.

> Didn't I explain in details what I am asking for?
>
> Thanks,
> --Konst
>
> On Sun, Mar 3, 2013 at 11:08 AM, Matt Foley <[EMAIL PROTECTED]> wrote:
> > Hi Konstantin,
> > I'd like to point out two things:
> > First, I already committed in this thread (email of Thu, Feb 28, 2013 at
> > 6:01 PM) to providing CI for Windows builds. So please stop acting like
> > I'm resisting this idea or something.
> > Second, you didn't answer my question, you just kvetched about the
> > phrasing.
> > So I ask again:
> >
> > Will providing full "test-patch" integration (pre-commit build and unit
> > test triggered by Jira "Patch Available" state) satisfy your request for
> > functionality #1 and #2? Yes or no, please.
> >
> > Thanks,
> > --Matt
> >
> > On Sat, Mar 2, 2013 at 7:32 PM, Konstantin Shvachko <[EMAIL PROTECTED]>
> > wrote:
> >>
> >> Hi Matt,
> >>
> >> On Sat, Mar 2, 2013 at 12:32 PM, Matt Foley <[EMAIL PROTECTED]>
> >> wrote:
> >> > Konstantin,
> >> > I would like to explore what it would take to remove this perceived
> >> > impediment --
> >>
> >> Glad you decided to explore. Thank you.
> >>
> >> > although I reserve the right to argue that this is not
> >> > pre-requisite to merging the cross-platform support patch.
> >>
> >> It's your right indeed. So as mine to question what the platform
> >> support means for you, which I believe remained unclear.
> >> I do not impede the change as you should have noticed. My requirement
> >> comes from my perception of the support, which means to me exactly two
> >> things:
> >> 1. The ability to recognise the code is broken for the platform
> >> 2. The ability to test new patches on the platform
> >> The latter is problematic, as many noticed in this thread, for those
> >> whose customary environment does not include Windows.
> >>
> >> > If we implemented full "test-patch" support for Windows on trunk,
> >> > would that fulfill both your items #1 and #2? Please note that:
> >> > a) Pushing the "Patch Available" button in Jira shall cause a
> >> > pre-commit build to start within, I believe, 20 minutes.
> >> > b) That build keeps logs for both java build and unit tests for
> >> > several days, that are accessible to all viewers.
> >>
> >> In item #1 I mostly asking for the nightly build, which is simpler
> >> than "test-patch". The latter would be ideal from the platform support
> >> viewpoint, but it is for the community to decide if we want to add
> >> extra +3 hours to the build.
> >> Nightly build in my understanding is triggered by the timer rather
> >> than by Jira's "submit patch" button. On Jenkins build configuration
> >> you can specify it under "Build periodically".

On Mon, Mar 4, 2013 at 12:22 PM, Matt Foley <[EMAIL PROTECTED]> wrote:
> Konstantin, you have voted -1, and stated some requirements before you'll
> withdraw that -1. As I plan to do work to fulfill those requirements, I
> want to make sure that what I'm proposing will, in fact, satisfy you.
> That's why I'm asking, if we implement full "test-patch" integration for
> Windows, does it seem to you that that would provide adequate support?

Yes.

> I have learned not to presume that my interpretation is correct. My
> interpretation of item #1 is that test-patch provides pre-commit build, so
> it would satisfy item #1. But rather than assuming that I am interpreting
> it correctly, I simply want your agreement that it would, or if not,
> clarification why it won't.

I agree it will satisfy my item #1.
I did not agree in my previous email, but I changed my mind based on the latest discussion. I have to explain why now.
I was proposing a nightly build because I did not want the pre-commit build for Windows to block commits to Linux. But if people are fine just ignoring -1s for the Windows part of the build, it should be good.

> Regarding item #2, it is also my interpretation that test-patch provides an
> on-demand (perhaps 20-minutes deferred) Jenkins build and unit test, with
> logs available to the developer, so it would satisfy item #2. But rather
> than assuming that I am interpreting it correctly, I simply want your
> agreement that it would, or if not, clarification why it won't.

It will satisfy my item #2 in the following way:
I can duplicate your pre-commit build for Windows and add an input parameter, which would let people run the build on patches chosen from their local machine rather than attaching them to Jiras.

Thanks,
--Konstantin



Ok, looks like we are converging on this across a few hundred emails ;)

So, as has been stated elsewhere: test-patch will be improved to fully support Windows; furthermore, -1 from Windows' test-patch won't block Linux commits. This is ok with me.

Can we have a JIRA ticket for that test-patch work assigned to the real owner, so it can be tracked? I am +1 in this case.

Cos

On Fri, Mar 01, 2013 at 04:57PM, Chris Douglas wrote:
> On Fri, Mar 1, 2013 at 1:57 PM, Konstantin Shvachko
> <[EMAIL PROTECTED]> wrote:
> > Commitment is a good thing.
> > I think the two builds that I proposed are a prerequisite for Win support.
> > If we commit windows patch people will start breaking it the next day.
> > Which we wont know without the nightly build and wont be able to fix
> > without the on-demand one.
>
> As several people have pointed out already, the surface of possible
> conflicts is relatively limited, and- as you did in 2007- the devs on
> Windows will report and fix bugs in that platform as they find them.
> CI is important for detecting and preventing bugs, but this isn't
> software we're launching into orbit.
>
> > Making two builds is less than 2 days work, imho, given that there is
> > a Windows node available and that mvn targets are in place. Correct me
> > if I missed any complications in the process.
>
> On Fri, Mar 1, 2013 at 3:47 PM, Konstantin Boudnik <[EMAIL PROTECTED]> wrote:
> > It seems that with the HW in place, the matter of setting at least nightly
> > build is trivial for anyone with up to date Windows knowledge. I wish I could
> > help. Going without a validation is a recipe for a disaster IMO.
>
> Fair enough, though that also implies that the window for regressions
> is small, and it leaves little room to doubt that this will receive
> priority. Until it's merged, spurious notifications that the current
> trunk breaks Windows are an awkward introduction to devs' workflow.
> The order of merge/CI is a choice between mild annoyances, really.
>
> But it might be moot. Giri: you're doing the work on this. When do you
> think it can be complete? -C

On Mon, Mar 4, 2013 at 11:39 PM, Suresh Srinivas <[EMAIL PROTECTED]> wrote:
> On Sun, Mar 3, 2013 at 8:50 PM, Harsh J <[EMAIL PROTECTED]> wrote:
>
>> Have we agreed (and stated it somewhere proper) that a -1 obtained for
>> a Windows CI build for a test-patch will not block the ongoing work
>> (unless it is Windows specific) and patches may still be committed to
>> trunk despite that?
>
> This thread is long and possibly hard to follow. Yes, I and several others
> have stated that for now it is okay to commit even if Windows precommit
> build posts -1.
>
>> I'm +1 if someone can assert and add the above into the formal
>> guidelines. I'd still prefer that Windows does its releases separately
>> as that ensures more quality for its audience and better testing
>> periods (and wouldn't block anything), but we can come to that iff we
>> are unable to maintain the currently proposed model.
>
> Which do you think is the right place to add this?
>
> At this time we are voting for merging into trunk. I prefer having a single
> release that supports both Linux and windows. Based on working on Windows
> support I think this is doable and should not hold up releases for Linux.

> Thanks Suresh. Regarding where; we can state it on
> http://wiki.apache.org/hadoop/HowToContribute in the test-patch
> section perhaps.
>
> +1 on the merge.
>
> On Mon, Mar 4, 2013 at 11:39 PM, Suresh Srinivas <[EMAIL PROTECTED]>
> wrote:
> > On Sun, Mar 3, 2013 at 8:50 PM, Harsh J <[EMAIL PROTECTED]> wrote:
> >
> >> Have we agreed (and stated it somewhere proper) that a -1 obtained for
> >> a Windows CI build for a test-patch will not block the ongoing work
> >> (unless it is Windows specific) and patches may still be committed to
> >> trunk despite that?
> >
> > This thread is long and possibly hard to follow. Yes, I and several
> > others have stated that for now it is okay to commit even if Windows
> > precommit build posts -1.
> >
> >> I'm +1 if someone can assert and add the above into the formal
> >> guidelines. I'd still prefer that Windows does its releases separately
> >> as that ensures more quality for its audience and better testing
> >> periods (and wouldn't block anything), but we can come to that iff we
> >> are unable to maintain the currently proposed model.
> >
> > Which do you think is the right place to add this?
> >
> > At this time we are voting for merging into trunk. I prefer having a
> > single release that supports both Linux and windows. Based on working
> > on Windows support I think this is doable and should not hold up
> > releases for Linux.
>
> --
> Harsh J

> Windows is so different from _any_ Unix or pseudo-Unix flavors, including
> Windows with Cygwin - that even multi-platform Java has a hard time
> dealing with it. This is enough, IMO, to warrant a separate checkpoint.

Cygwin is the worst of both worlds: not Unix, not Windows. Dropping it for
proper Windows is much better. Even dropping it altogether would be better.
We hate Cygwin problems in Ant as users have unrealistic expectations about
the filesystem and how programs run.

CI-wise, it'd be good to have nightly builds and a preflight check per
JIRA. It sounds like the consensus that is evolving is (in RFC 2119 terms):

1. CI test runs on Windows SHALL be provided (Matt has promised this)
2. A patch with Pass(Linux) && Fail(windows) tests MAY be committed
3. A patch with Pass(Linux) && Fail(windows) tests SHALL be fixed - but not
   necessarily by the author of the original patch
4. A patch with Pass(windows) && Fail(linux) tests MUST NOT be committed
5. It is assumed that if it works on Linux, it SHOULD work on other Unix
6. A patch with Pass(other-unix) && Fail(linux) tests MUST NOT be committed
   (this has never arisen that I know of). This could be merged with (3) to
   state that: all patches must Pass(Linux).
7. The unix-side operation MAY BE optimised for Linux at the expense of
   other Unices (I remember that for exec()/fork() a way, way back).
8. The unix-side operation MAY contain features that only work on Linux
   (YARN-3 cgroups are an example of this)
9. A patch that is optimised for Linux SHOULD have a Windows alternative
   (c.f. local sockets). The alternative MAY be java code that substitutes
   for native C/C++/assembler
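
To make the commit rules above concrete, here is a minimal sketch in Python of how a gate might combine the two precommit results. The names `Verdict` and `commit_verdict` are made up for illustration; nothing like this exists in test-patch, and it only encodes my reading of rules 2, 3, 4 and 6.

```python
from enum import Enum


class Verdict(Enum):
    """Hypothetical outcomes for a patch, following the proposed rules."""
    REJECT = "MUST NOT be committed"
    COMMIT = "MAY be committed"
    COMMIT_THEN_FIX = "MAY be committed; Windows failure SHALL be fixed"


def commit_verdict(linux_pass: bool, windows_pass: bool) -> Verdict:
    """Combine Linux and Windows precommit results into one verdict.

    Assumption: Linux is the gating platform, so a Linux failure always
    blocks, while a Windows failure only creates follow-up work.
    """
    if not linux_pass:
        # Rules 4 and 6: every patch must pass on Linux.
        return Verdict.REJECT
    if not windows_pass:
        # Rules 2 and 3: commit is allowed, but someone (not necessarily
        # the patch author) has to fix the Windows side afterwards.
        return Verdict.COMMIT_THEN_FIX
    return Verdict.COMMIT


if __name__ == "__main__":
    for linux in (True, False):
        for windows in (True, False):
            verdict = commit_verdict(linux, windows)
            print(f"linux={linux} windows={windows}: {verdict.value}")
```

The point of writing it this way is that the Windows result never appears in the rejecting branch, which is exactly the "-1 from Windows won't block Linux commits" agreement.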

That mention of native code raises another question: CPU support.

Now that there are some ARM boxes running Hadoop on Jenkins, perhaps as AMD
ramp up their ARM story and IBM continue with Power, and some new x86 ASM
code is coming from Intel, we ought to have a policy there, something like:

1. Hadoop MAY contain code that works best on x86 systems.
2. Hadoop MAY contain code that works best on ARM systems.
3. Hadoop MUST NOT contain code that only works on a single CPU family
4. Hadoop SHOULD NOT contain code that works best on a single CPU family
   AND makes performance worse on other CPU families. E.g. we shouldn't
   mandate CRC, compression and encryption algorithms that speed up on one
   CPU family yet are significantly worse on other CPU families than a
   platform-neutral algorithm.
5. Hadoop MUST NOT contain code that has assumptions about the CPU memory
   model other than the java 5+ memory model
   [http://www.cs.umd.edu/~pugh/java/memoryModel/] . This is automatic for
   Java code, but needs to be included in native code, as volatile is not a
   barrier operation in C/C++, some CPUs implement different barrier op
6. Hadoop MUST NOT contain code that has hard-coded assumptions about cache
   other than CPUs implement coherency across cores. No hard coded
   assumptions about cache line sizes, write-through vs. write back. RAM
   could be NUMA, but MUST be consistent w.r.t. the causal model of the
   happens-before semantics of the java 5+ memory model. (if you didn't
   understand that, read up on memory models)
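
As a rough illustration of rules 3 and 4 (and of the "portable alternative" idea for native code), here is a sketch in Python of picking an accelerated checksum only when it agrees with a platform-neutral one. `crc32_portable` and `select_crc32` are hypothetical names, and `zlib.crc32` merely stands in for whatever CPU-specific routine a platform might offer; this is not how Hadoop's native CRC code is wired up.

```python
import zlib


def _make_table():
    """Build the standard reflected CRC-32 lookup table (poly 0xEDB88320)."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xEDB88320 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32_portable(data: bytes) -> int:
    """Platform-neutral CRC-32: plain integer arithmetic, runs anywhere."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = _TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def select_crc32(accelerated=None):
    """Return the accelerated routine only if it matches the portable one.

    Assumption: `accelerated` is any callable taking bytes and returning
    an int. If it is missing, raises, or disagrees on a probe input, the
    portable implementation is used instead.
    """
    probe = b"123456789"
    if accelerated is not None:
        try:
            if accelerated(probe) == crc32_portable(probe):
                return accelerated
        except Exception:
            pass
    return crc32_portable


if __name__ == "__main__":
    crc32 = select_crc32(zlib.crc32)
    print(hex(crc32(b"hello hadoop")))
```

The portable path always exists and defines the expected answer, so a CPU family without the fast path loses speed but never correctness.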

That could be teased out for a separate discussion and vote, along with
- maybe - a policy w.r.t. non-HDFS filesystems (which could be: patches SHOULD
NOT break other FSs, MAY reduce performance, MAY break them (without tests,
who knows?)). People implementing alternative filesystems and asserting
compatibility with Hadoop SHOULD run their own CI tests.