Details

Description

By using a ServiceLoader users wouldn't have to add codec classes to io.compression.codecs for codecs that aren't shipped with Hadoop (e.g. LZO), since they would be automatically picked up from the classpath.

Hadoop QA
added a comment - 02/Jun/11 01:22 +1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12481172/HADOOP-7350.patch
against trunk revision 1129989.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 system test framework. The patch passed system test framework compile.
Test results: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/557//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/557//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/557//console
This message is automatically generated.

We should probably remove the codecs from core-default.xml now that they're loaded via ServiceLoader

Is there a way to inject a new codec programatically through the ServiceLoader interface? If so, we could entirely deprecate io.compression.codecs. If not, maybe we should rename it to something like io.compression.extra.codecs and specify that it's only necessary if you have a codec that doesn't expose itself through ServiceLoader?

hdfs-default.xml has an item dfs.image.compression.codec that needs to be updated

Todd Lipcon
added a comment - 02/Jun/11 02:24
We should probably remove the codecs from core-default.xml now that they're loaded via ServiceLoader
Is there a way to inject a new codec programatically through the ServiceLoader interface? If so, we could entirely deprecate io.compression.codecs. If not, maybe we should rename it to something like io.compression.extra.codecs and specify that it's only necessary if you have a codec that doesn't expose itself through ServiceLoader?
hdfs-default.xml has an item dfs.image.compression.codec that needs to be updated

We should probably remove the codecs from core-default.xml now that they're loaded via ServiceLoader

Done - see new patch.

Is there a way to inject a new codec programatically through the ServiceLoader interface? If so, we could entirely deprecate io.compression.codecs. If not, maybe we should rename it to something like io.compression.extra.codecs and specify that it's only necessary if you have a codec that doesn't expose itself through ServiceLoader?

I don't think we need to deprecate or rename io.compression.codecs - it's just used to specify additional codecs to the ones that are loaded through a ServiceLoader. Note that duplicates are ignored, so there's no problem with users older configs having codecs that could be loaded through ServiceLoader.

hdfs-default.xml has an item dfs.image.compression.codec that needs to be updated

This doesn't need to be updated, although with HADOOP-7323 (and a corresponding HDFS change) it could be changed to "default".

Tom White
added a comment - 02/Jun/11 22:18
We should probably remove the codecs from core-default.xml now that they're loaded via ServiceLoader
Done - see new patch.
Is there a way to inject a new codec programatically through the ServiceLoader interface? If so, we could entirely deprecate io.compression.codecs. If not, maybe we should rename it to something like io.compression.extra.codecs and specify that it's only necessary if you have a codec that doesn't expose itself through ServiceLoader?
I don't think we need to deprecate or rename io.compression.codecs - it's just used to specify additional codecs to the ones that are loaded through a ServiceLoader. Note that duplicates are ignored, so there's no problem with users older configs having codecs that could be loaded through ServiceLoader.
hdfs-default.xml has an item dfs.image.compression.codec that needs to be updated
This doesn't need to be updated, although with HADOOP-7323 (and a corresponding HDFS change) it could be changed to "default".

Hadoop QA
added a comment - 02/Jun/11 22:33 +1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12481287/HADOOP-7350.patch
against trunk revision 1130758.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 system test framework. The patch passed system test framework compile.
Test results: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/563//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/563//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/563//console
This message is automatically generated.

Sorry, the default doesn't need to change, but the docs should be updated: it no longer needs to be one of the codecs listed in io.compression.codecs. It just needs to be any codec that's "registered" (either via conf or via ServiceLoader). I don't know the best verbiage to succinctly document it, though.

Todd Lipcon
added a comment - 02/Jun/11 22:35 This doesn't need to be updated
Sorry, the default doesn't need to change, but the docs should be updated: it no longer needs to be one of the codecs listed in io.compression.codecs. It just needs to be any codec that's "registered" (either via conf or via ServiceLoader). I don't know the best verbiage to succinctly document it, though.
The rest of your points make sense.

Hadoop QA
added a comment - 03/Jun/11 00:53 +1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12481304/HADOOP-7350.patch
against trunk revision 1130833.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 system test framework. The patch passed system test framework compile.
Test results: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/565//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/565//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/565//console
This message is automatically generated.

Hadoop QA
added a comment - 06/Jun/11 23:03 +1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12481630/HADOOP-7350.patch
against trunk revision 1132776.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 system test framework. The patch passed system test framework compile.
Test results: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/586//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/586//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/586//console
This message is automatically generated.

I moved it to take advantage of the ServiceLoader's caching of providers. This new patch does that but moves the iteration back to the getCodecClasses() method. (This is now like the example in the ServiceLoader documentation.)

Tom White
added a comment - 07/Jun/11 18:56 I moved it to take advantage of the ServiceLoader's caching of providers. This new patch does that but moves the iteration back to the getCodecClasses() method. (This is now like the example in the ServiceLoader documentation.)

Hadoop QA
added a comment - 07/Jun/11 19:13 +1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12481726/HADOOP-7350.patch
against trunk revision 1132776.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 system test framework. The patch passed system test framework compile.
Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/587//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/587//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/587//console
This message is automatically generated.

Tom White
added a comment - 20/Jan/12 00:36 Yes, this got forgotten. I've updated the patch to the new source structure, and added a couple of tests to check that Snappy and LZ4 extensions are picked up.

Hadoop QA
added a comment - 24/Jan/12 20:10 -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12511199/HADOOP-7350.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated 7 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in .
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/528//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/528//console
This message is automatically generated.

Tom, thanks for reviving this one. Things look OK, the only thing is that I'd revert the logic to first load the codecs defined as services and after that the ones defined in Hadoop's configuration. By doing that, it is possible for a user to override a provided implementation.

Alejandro Abdelnur
added a comment - 13/Feb/12 20:06 Tom, thanks for reviving this one. Things look OK, the only thing is that I'd revert the logic to first load the codecs defined as services and after that the ones defined in Hadoop's configuration. By doing that, it is possible for a user to override a provided implementation.

Tom White
added a comment - 13/Feb/12 21:00 Thanks for taking a look, Alejandro. Here's an updated patch in which codecs defined in the configuration take precedence over those picked up by the service loader.

I may have expressed incorrectly, my suggestion was that in the getCodecClasses(Configuration) method, the loop loading the codecs from CODEC_PROVIDERS should be done before loading the codecs from the Hadoop 'io.compression.codec' property. Also, in the CompressionCodecFactory() constructor, doing a reverse of the list seems wrong as that would effectively put the service loaded ones first. I think that the reverse should not be there, then the list of codecs is ALL_SERVICE_CODECS (in non-deterministic order) and then the configuration set ones, in order or appearance where the last one overrides.

Alejandro Abdelnur
added a comment - 14/Feb/12 06:05 I may have expressed incorrectly, my suggestion was that in the getCodecClasses(Configuration) method, the loop loading the codecs from CODEC_PROVIDERS should be done before loading the codecs from the Hadoop 'io.compression.codec' property. Also, in the CompressionCodecFactory() constructor, doing a reverse of the list seems wrong as that would effectively put the service loaded ones first. I think that the reverse should not be there, then the list of codecs is ALL_SERVICE_CODECS (in non-deterministic order) and then the configuration set ones, in order or appearance where the last one overrides.

Hadoop QA
added a comment - 23/Mar/12 22:53 -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12514395/HADOOP-7350.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
-1 patch. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/768//console
This message is automatically generated.

Tom White
added a comment - 13/Apr/12 19:48 Updated patch to address Alejandro's feedback. Note that codecs defined in the configuration still take precedence over those picked up by the service loader.

Hadoop QA
added a comment - 13/Apr/12 20:05 -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12522601/HADOOP-7350.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 1 new or modified test files.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.fs.viewfs.TestViewFsTrash
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/850//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/850//console
This message is automatically generated.