Create target for 10 minute patch test build for mapreduce

Details

Added a new target, 'test-commit', to the build.xml file; it runs the tests listed in the file src/test/commit-tests. The tests listed there should provide maximum coverage, and together they should run within 10 minutes.
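As an illustration of the flat-file idea, here is a minimal sketch of reading a test list such as src/test/commit-tests. The exact file format is not shown in this issue, so the format here (one entry per line, blank lines and '#' comments ignored) is an assumption, and parse_test_list and the sample entries are hypothetical names used only for illustration.

```python
def parse_test_list(text):
    """Return the entries of a flat test-list file, one per line.

    Assumed format (not confirmed by the issue): one test entry per line;
    blank lines and lines starting with '#' are skipped.
    """
    entries = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        entries.append(line)
    return entries


# Placeholder contents; these are not the real commit-tests entries.
sample = """
# core flow
TestExampleOne
TestExampleTwo
"""
print(parse_test_list(sample))
```

The point of the sketch is that the suite's contents live in data rather than in source code, so changing the list does not require touching any Java class.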

Activity

Jothi Padmanabhan
added a comment - 29/Jun/09 03:25 In working toward this target, one of the activities was to identify a subset of tests that runs in close to 10 minutes and has reasonable coverage. The idea is to have the main core flow and core components well tested by the fast test suite; the corner cases and library classes can afford to be left out. Needless to say, the entire test suite will remain part of the nightly test cycle.
Attaching a spreadsheet as a first step in identifying and classifying test suites as Fast Tests. The run time of these tests is marginally over 10 minutes (the entire test suite runs in about 2 hours). Work is still under way to improve a few more test cases, which should hopefully bring this run time down further.
The attached file has three sheets:
Proposed set of tests that could make up the Fast Test suite
Analysis of existing tests in the mapred package and whether they have been considered for fast tests, with some notes
List of classes where the coverage drop between the fast tests and all tests is greater than 10%, with information on where the drop comes from
Clover was used for measuring code coverage.
Please share your thoughts, suggestions, and ideas.
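The third sheet's criterion (classes whose coverage drops by more than 10% between all tests and the fast tests) can be sketched as a simple filter. This is only an illustration: it assumes the drop is measured in percentage points and that per-class coverage is available as a mapping, neither of which the comment states. The function name and the class names in the usage lines are hypothetical.

```python
def classes_with_large_drop(all_cov, fast_cov, threshold=10.0):
    """Return {class_name: drop} where the coverage drop exceeds threshold.

    all_cov and fast_cov map class names to coverage percentages.
    A class missing from fast_cov is treated as 0% covered by the fast tests.
    """
    drops = {}
    for name, full in all_cov.items():
        drop = full - fast_cov.get(name, 0.0)
        if drop > threshold:
            drops[name] = drop
    return drops


# Placeholder numbers, not taken from the attached spreadsheet.
all_tests = {"ClassA": 80.0, "ClassB": 70.0}
fast_tests = {"ClassA": 60.0, "ClassB": 65.0}
print(classes_with_large_drop(all_tests, fast_tests))
```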

Nigel Daley
added a comment - 01/Jul/09 06:40 This looks great. Can you go ahead and list these tests in a JUnit test suite class? I think that is the best way for now to capture them in a single suite that is run by the commit build.

Jakob Homan
added a comment - 01/Jul/09 23:40 I spoke with Nigel and, as discussed in HDFS-458, a flat file included in the build.xml file might be a better solution than a JUnit test suite class. It looks like that's the way we're going on the HDFS side.

Konstantin Boudnik
added a comment - 28/Jul/09 21:16 I've had yet another conversation with Jakob about HDFS-458 and HDFS-505, and I'm going to agree with his approach: an external configuration file (i.e. a test list) seems to provide a reasonable compromise between ease of use and maintenance headache. An external test list file allows the contents of a suite to be changed without any source code modifications, which is a bonus for someone not familiar with the Java language.
On the other hand, a separate JUnit test suite won't give us any extra benefits compared to a flat test file.
+1 on the HDFS-458 approach.

Jothi Padmanabhan
added a comment - 30/Jul/09 04:54 Patch fixing the indentation issue pointed out by Konstantin.
A few additional points:
Code coverage (for the mapred package) of the all-tests list is 76%.
Code coverage (for the mapred package) of the commit-tests list is 59%.
I have left two tests (TestJobTrackerRestart and TestQueueManager) out of this list, as they take about 7 and 6 minutes respectively. A separate effort is under way to refactor them into unit tests; when that is completed, they will be added to the commit-tests list. The current tests run in less than 9 minutes, so adding these two tests, after they have been refactored, should still keep the run time within 10 minutes.
Code coverage (for the mapred package) of commit-tests + TestJobTrackerRestart + TestQueueManager (as they exist now) is 63%.
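The arithmetic behind these points can be worked through in a short sketch. All figures come from the comment above; the variable names are illustrative, and the 9-minute figure is used as an upper bound because the current list is only said to run in "less than 9 minutes".

```python
BUDGET_MIN = 10

# Upper bound on the current commit-tests run time ("less than 9 minutes").
current_suite_min = 9

# Approximate run times of the two excluded tests as they exist now.
excluded_min = {"TestJobTrackerRestart": 7, "TestQueueManager": 6}

# Adding them unrefactored would push the run to as much as 22 minutes.
total_if_added_now = current_suite_min + sum(excluded_min.values())

# Headroom left in the budget: at least 1 minute for both refactored tests.
headroom_min = BUDGET_MIN - current_suite_min

# Coverage of the mapred package, in percent.
coverage_all = 76
coverage_commit = 59
coverage_commit_plus_two = 63

# Gap between the full suite and the commit list, in percentage points.
gap_points = coverage_all - coverage_commit

# Coverage gained by adding the two excluded tests, in percentage points.
gain_points = coverage_commit_plus_two - coverage_commit

print(total_if_added_now, headroom_min, gap_points, gain_points)
```

In other words, the two tests together cost up to 13 minutes today for a 4-point coverage gain, which is why they need to be refactored before they can fit in the remaining budget.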

Lee Tucker
added a comment - 03/Aug/09 16:27 The patch indicates that a file called commit-tests should be part of it, but when I go to the tip of trunk, no such file has been committed.