Path.normalize should use StringUtils.replace in favor of String.replace

Description

In our environment, the JobClient is running out of memory because Path.normalizePath(String) is called tens of thousands of times, and each call invokes String.replace twice.

java.lang.String.replace compiles a regex internally to do the job, which is very costly.
We should use org.apache.commons.lang.StringUtils.replace instead, which is much faster and consumes almost no extra memory.
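
To illustrate the difference, here is a minimal sketch of a regex-free literal replace built on indexOf and StringBuilder, similar in spirit to what StringUtils.replace does. The class LiteralReplace and the methods replaceLiteral and normalizeSlashes are hypothetical names for illustration; the two replacements shown ("//" and "\") are an assumption about what normalizePath replaces, not a copy of the Hadoop source.

```java
public class LiteralReplace {

    /**
     * Replaces every occurrence of a literal target string without going
     * through java.util.regex. Hypothetical helper, not the Hadoop or
     * commons-lang implementation.
     */
    static String replaceLiteral(String text, String target, String replacement) {
        if (text == null || target == null || target.isEmpty() || replacement == null) {
            return text;
        }
        int start = 0;
        int end = text.indexOf(target, start);
        if (end < 0) {
            // Nothing to replace: return the input itself, no allocation.
            return text;
        }
        StringBuilder sb = new StringBuilder(text.length());
        while (end >= 0) {
            // Copy the unmatched stretch, then the replacement.
            sb.append(text, start, end).append(replacement);
            start = end + target.length();
            end = text.indexOf(target, start);
        }
        // Copy whatever follows the last match.
        sb.append(text, start, text.length());
        return sb.toString();
    }

    /** Assumed stand-in for the two replace calls made during path normalization. */
    static String normalizeSlashes(String path) {
        path = replaceLiteral(path, "//", "/");
        path = replaceLiteral(path, "\\", "/");
        return path;
    }

    public static void main(String[] args) {
        System.out.println(normalizeSlashes("/user//hive\\warehouse"));
    }
}
```

When the path contains neither "//" nor a backslash, which is likely the common case, this version returns the original string without allocating anything.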

Activity

Eli Collins
added a comment - 13/Jan/10 03:02
I noticed Path#normalizePath and Shard#normalizePath have diverged; it may be worth merging them while you're at it so the performance issue is fixed in both places. It also sounds fishy that normalizePath is called tens of thousands of times. Perhaps the calls are not necessary, or the result could be cached if it's being called frequently on the same path.
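
The caching idea could look roughly like the sketch below: a small bounded LRU map in front of the normalization call. Everything here is hypothetical (the NormalizeCache class, the MAX_ENTRIES size of 1024, and the expensiveNormalize stand-in); it only shows the shape of the approach, not a proposed patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NormalizeCache {

    // Arbitrary bound chosen for illustration only.
    private static final int MAX_ENTRIES = 1024;

    // Access-ordered LinkedHashMap that evicts the least recently used entry.
    private final Map<String, String> cache =
        new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > MAX_ENTRIES;
            }
        };

    // Counts how often the underlying normalization actually ran.
    int misses = 0;

    synchronized String normalize(String path) {
        String cached = cache.get(path);
        if (cached != null) {
            return cached;
        }
        misses++;
        String result = expensiveNormalize(path);
        cache.put(path, result);
        return result;
    }

    /** Assumed stand-in for the real normalization logic. */
    static String expensiveNormalize(String path) {
        return path.replace("//", "/").replace("\\", "/");
    }

    public static void main(String[] args) {
        NormalizeCache c = new NormalizeCache();
        c.normalize("/a//b");
        c.normalize("/a//b");
        System.out.println(c.misses);
    }
}
```

A cache like this only pays off if the same path strings really do recur; otherwise it adds memory pressure of its own, so measuring the call pattern first would be the safer step.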

Hadoop QA
added a comment - 19/Sep/11 20:53
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12495125/HADOOP-6490.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in .
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/204//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/204//console
This message is automatically generated.