Hadoop QA
added a comment - 25/Sep/15 21:23 A patch to the files used for the QA process has been detected.
Re-executing against the patched versions to perform further tests.
The console is at https://builds.apache.org/job/PreCommit-HDFS-Build/12685/console in case of problems.

Hadoop QA
added a comment - 26/Sep/15 08:51 A patch to the files used for the QA process has been detected.
Re-executing against the patched versions to perform further tests.
The console is at https://builds.apache.org/job/PreCommit-HDFS-Build/12689/console in case of problems.

Hadoop QA
added a comment - 26/Sep/15 17:50 A patch to the files used for the QA process has been detected.
Re-executing against the patched versions to perform further tests.
The console is at https://builds.apache.org/job/PreCommit-HDFS-Build/12692/console in case of problems.

Hadoop QA
added a comment - 26/Sep/15 20:03 A patch to the files used for the QA process has been detected.
Re-executing against the patched versions to perform further tests.
The console is at https://builds.apache.org/job/PreCommit-HDFS-Build/12694/console in case of problems.

Hadoop QA
added a comment - 27/Sep/15 08:58 A patch to the files used for the QA process has been detected.
Re-executing against the patched versions to perform further tests.
The console is at https://builds.apache.org/job/PreCommit-HDFS-Build/12701/console in case of problems.

Hadoop QA
added a comment - 27/Sep/15 14:28 A patch to the files used for the QA process has been detected.
Re-executing against the patched versions to perform further tests.
The console is at https://builds.apache.org/job/PreCommit-HDFS-Build/12702/console in case of problems.

Hadoop QA
added a comment - 27/Sep/15 20:09 A patch to the files used for the QA process has been detected.
Re-executing against the patched versions to perform further tests.
The console is at https://builds.apache.org/job/PreCommit-HDFS-Build/12703/console in case of problems.

Hadoop QA
added a comment - 08/Oct/15 05:52 A patch to the files used for the QA process has been detected.
Re-executing against the patched versions to perform further tests.
The console is at https://builds.apache.org/job/PreCommit-HDFS-Build/12853/console in case of problems.

Hadoop QA
added a comment - 08/Oct/15 08:40 A patch to the files used for the QA process has been detected.
Re-executing against the patched versions to perform further tests.
The console is at https://builds.apache.org/job/PreCommit-HDFS-Build/12859/console in case of problems.

Hadoop QA
added a comment - 08/Oct/15 13:39 A patch to the files used for the QA process has been detected.
Re-executing against the patched versions to perform further tests.
The console is at https://builds.apache.org/job/PreCommit-HDFS-Build/12863/console in case of problems.

Hadoop QA
added a comment - 09/Oct/15 14:15 A patch to the files used for the QA process has been detected.
Re-executing against the patched versions to perform further tests.
The console is at https://builds.apache.org/job/PreCommit-HDFS-Build/12890/console in case of problems.

Hadoop QA
added a comment - 12/Oct/15 06:42 A patch to the files used for the QA process has been detected.
Re-executing against the patched versions to perform further tests.
The console is at https://builds.apache.org/job/PreCommit-HDFS-Build/12929/console in case of problems.

Hadoop QA
added a comment - 12/Oct/15 13:57 A patch to the files used for the QA process has been detected.
Re-executing against the patched versions to perform further tests.
The console is at https://builds.apache.org/job/PreCommit-HDFS-Build/12932/console in case of problems.

Vinayakumar B
added a comment - 12/Oct/15 14:01 I think, from recent runs, the parallel test runs look quite stable.
Total build time drops from ~250 min to ~90 min.
I think it's okay to commit the current changes and fix any further random issues in follow-on JIRAs.
Chris Nauroth, Steve Loughran, Haohui Mai: do you agree?

Walter Su
added a comment - 15/Oct/15 10:25 Hi, guys! I have a question: is it parallel per test suite (class), or parallel per test case?
It looks like it's parallel in test case.
https://builds.apache.org/job/PreCommit-HDFS-Build/12871/testReport/org.apache.hadoop.hdfs/TestSafeModeWithStripedFile/
https://builds.apache.org/job/PreCommit-HDFS-Build/12998/testReport/org.apache.hadoop.hdfs.server.blockmanagement/TestBlockTokenWithDFSStriped/
These two classes are stable in the feature branch, but they have been failing often recently. I found that the running times of two cases in one suite overlap.

Vinayakumar B
added a comment - 15/Oct/15 11:44 AFAIK, it's the test suites that run in parallel.
"It looks like it's parallel in test case."
I think you found this by comparing the logs of passed and failed tests in the same test suite, right?
The report shows the entire test suite's logs for passed tests, and only the particular test case's logs for failed tests.
That is, the failed tests' logs are also included in the passed tests' logs. That is why you see the overlap and conclude that test cases run in parallel.
If you look directly at the XML reports generated by Surefire, there are no per-test-case logs for passed test cases.
For tests that fail only when parallel tests are enabled, there can be 3 causes:
1. Port binding issues. Ideally no port should be hard-coded, but some tests expect DNs/NNs to restart on the same ports, so this could be the case.
2. Files written outside the test.build.data directory.
3. Waiting time for some events. In parallel runs, tests need to wait a little longer, since the CPUs of the build machines will be busy.
If a test fails for some other reason, there might be a functional issue.
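To make the three causes above concrete, here is a minimal sketch of how a test might avoid each of them. It is not code from the patch: the class ParallelTestHelpers and its method names are hypothetical, and the fallback path target/test/data is an assumption for when test.build.data is not set. Note that asking the OS for a free port does not help tests that need a DN/NN to come back on the same port after a restart, and the port could in principle be taken by another process between the probe and its later use.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.BooleanSupplier;

/** Hypothetical helpers illustrating the three failure causes; not part of Hadoop. */
public class ParallelTestHelpers {

    /** Cause 1: let the OS pick a free port instead of hard-coding one. */
    public static int freePort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }

    /** Cause 2: keep every file under test.build.data, in a per-suite subdirectory. */
    public static Path suiteDir(String suiteName) throws IOException {
        // Assumed fallback when the property is not set by the build.
        String base = System.getProperty("test.build.data", "target/test/data");
        return Files.createDirectories(Paths.get(base, suiteName));
    }

    /** Cause 3: poll for a condition up to a timeout instead of sleeping a fixed time. */
    public static boolean waitFor(BooleanSupplier condition, long intervalMs, long timeoutMs)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() >= deadline) {
                return false; // condition never became true within the timeout
            }
            Thread.sleep(intervalMs);
        }
        return true;
    }
}
```

With polling, a generous timeout costs nothing when the event arrives quickly, so it can be raised for busy build machines without slowing down the normal case.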

Chris Nauroth
added a comment - 15/Oct/15 14:19 Vinay is correct. I specifically chose parallel-per-suite instead of parallel-per-test. Isolation for our tests is already tricky enough when considered just at the suite class level without trying to run individual tests in parallel.
It would be great if people with an itch to write some patches could jump on any flaky tests that might have been exposed by running in parallel.