Details

Description

DNSToSwitchMapping now has only one API to resolve a host list: public List<String> resolve(List<String> names). But the two major caller: RackResolver.resolve() and DatanodeManager.resolveNetworkLocation() are taking single host name but have to wrapper it to an single entry ArrayList. This is not necessary especially the host has been cached before.

Junping Du
added a comment - 24/Apr/12 07:54 The patch to add resolve(String host) interface to DNSToSwitchMapping, and some related unit tests. Also, identify a potential bug that a hostname start with number may not been resolved properly.

Sounds reasonable to me Junping. test-patch doesn't run on cross-project patches. For this patch just introduce the new interface in common, and then file a separate HDFS and MR jira to use the new API.

Eli Collins
added a comment - 26/Apr/12 01:36 Sounds reasonable to me Junping. test-patch doesn't run on cross-project patches. For this patch just introduce the new interface in common, and then file a separate HDFS and MR jira to use the new API.
Thanks,
Eli

Junping Du
added a comment - 26/Apr/12 04:36 Hello Eli,
I already updated the code to include common part only. The rest code (HDFS/MAPREDUCE) are going to MAPREDUCE-4198 and HDFS-3324 . All are marked patch available for review.
Cheers,
Junping

Hadoop QA
added a comment - 26/Apr/12 04:45 -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12524388/HADOOP-8304-V2.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 5 new or modified test files.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.fs.viewfs.TestViewFsTrash
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/889//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/889//console
This message is automatically generated.

I update patches in HDFS-3324 and MAPREDUCE-4198 as previous patch is wrongly including common project code (caused by switching between local branch). I should be more careful next time. Sorry for the trouble.

Junping Du
added a comment - 26/Apr/12 04:59 I update patches in HDFS-3324 and MAPREDUCE-4198 as previous patch is wrongly including common project code (caused by switching between local branch). I should be more careful next time. Sorry for the trouble.

Hadoop QA
added a comment - 26/Apr/12 05:25 +1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12524393/HADOOP-8304-V2.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 5 new or modified test files.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in .
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/890//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/890//console
This message is automatically generated.

Tom White
added a comment - 26/Apr/12 19:26 This adds a method to an interface which is an incompatible change, and will break implementers. Is the overhead of wrapping a string in a list causing performance problems?

Eli Collins
added a comment - 26/Apr/12 22:09 Good point, forgot that people override DNSToSwitchMapping. Technically DNSToSwitchMapping is evolving (and should stay that way), but yea no sense in breaking people if we don't have to.
Junping, how about introducing the fix in a compatible way with the List interface?

That's good comments on compatibility of this change. I saw evolving tag there but not consider people extends DNSToSwitchMapping as well (just thought ScriptBased, TableMapping and cached are good enough).
I won't say original interface cause some performance headache as the time of resolving rack info can be overwhelmed comparing with the whole flow (replica placement or task scheduling). However, it is more easy to use for major consumers of original interface which are expecting to resolve individual host. Do you see any scenario to resolve a list of host? (not counting the unit test)
Eli, I don't understand the question of last comment there as I just want to fix the interface here.

Junping Du
added a comment - 05/May/12 11:35 That's good comments on compatibility of this change. I saw evolving tag there but not consider people extends DNSToSwitchMapping as well (just thought ScriptBased, TableMapping and cached are good enough).
I won't say original interface cause some performance headache as the time of resolving rack info can be overwhelmed comparing with the whole flow (replica placement or task scheduling). However, it is more easy to use for major consumers of original interface which are expecting to resolve individual host. Do you see any scenario to resolve a list of host? (not counting the unit test)
Eli, I don't understand the question of last comment there as I just want to fix the interface here.

Eli Collins
added a comment - 08/May/12 17:38 Do you see any scenario to resolve a list of host? (not counting the unit test)
DatanodeManager does that today with a list obtained from the hosts files.
I don't understand the question of last comment there as I just want to fix the interface here
You mentioned earlier "identify a potential bug that a hostname start with number may not been resolved properly" - doesn't that issue still need to be addressed?

That's for caching only. If you think list interface here is good, I agree that it may be better to leave it there which will not take complexity of incompatibility.
About the bug that wrongly resolve the hostname starting with number, yes. I will go ahead to file a jira and it could be easy to fix it.

Junping Du
added a comment - 08/May/12 17:54 That's for caching only. If you think list interface here is good, I agree that it may be better to leave it there which will not take complexity of incompatibility.
About the bug that wrongly resolve the hostname starting with number, yes. I will go ahead to file a jira and it could be easy to fix it.

Junping Du
added a comment - 09/May/12 13:30 Hi Eli,
The jira is filed as https://issues.apache.org/jira/browse/HADOOP-8372 . I uploaded a patch to fix this, but that will cause some side-effect.
Thanks,
Junping