After the Baseline

We now have three routes we can follow in proceeding with the micro-tuning:

Consider the inefficiencies highlighted by the baseline measurements.

Analyze the algorithm to determine if there are generic inefficiencies that we can improve.

Profile the method to determine where the bottlenecks are, if there are any.

These three routes are generic; micro-tuning generally has these options, though any one option may become unavailable when the code becomes more efficient. Usually, some of the options are easier to proceed with than others, but for this article it is instructive to look at all three.

Dataset 3

The baseline measurement shows that using Dataset 3 required a much longer time to execute than using the other two datasets. The major difference in this dataset is the existence of strings that do not hold valid numbers. We can easily follow the logic for this case: the method tries to create an Integer object by parsing the string using the Integer constructor. In cases where the string is not a number, the constructor throws a NumberFormatException. This is caught in our catch block, and false is returned. So it seems that this procedure is extremely expensive.

In fact, it is fairly well known that creating an exception is an expensive procedure. The actual cost comes from filling out the stack trace that the exception holds, and even though we never care about the stack trace in this method, we cannot avoid its cost. We can, however, avoid the exception completely by checking that we have a valid number before creating the Integer. This is simply done by checking all of the characters to make sure they are all digits:

for (int i = 0; i < testInteger.length(); i++)
if (!Character.isDigit(testInteger.charAt(i)))
return false;

Note that negative numbers will return false using this loop, but that is acceptable, since the method in any case always returns false when passed a negative number. Re-running our test with this code added to the beginning of the method produces the results listed in Table 2.

Table 2: After adding the looped digit test

Dataset 1

Dataset 2

Dataset 3

Baseline

100%

84.1%

540.0%

With looped digit test

114.8%

66.1%

51.4%

So we've improved the test using Dataset 3 to be faster than the tests using both the other datasets, in line with what we expected. The test using Dataset 2 is also improved, which I didn't expect.

The Algorithm

Let's look at the algorithm for any inefficiencies. The algorithm has one obvious redundancy: we check that the integer is greater than 10 and, if it is, we subsequently check that it is greater than or equal to 2.

(theInteger.intValue() > 10) && ((theInteger.intValue() >= 2)

Clearly, we can dispense with the second test completely. If the first (>10) test fails, then the following (>=2) will not be executed, since we are using the shortcut Boolean AND operator (&&), which only executes its right-hand side if the left-hand side is true. If the "greater than 10" test succeeds, then we know definitely that the "greater than or equal to 2" test will succeed.
Are there further inefficiencies? Let's go line-by-line. For convenience, here's the full function again.

First we create an Integer (5). Then we test for an empty string (7). But on checking the Integer class, an empty string is an invalid number for the Integer constructor, so if we do have an empty string, this test will never be executed. If an empty string is passed to the method, the Integer constructor will throw an exception and the catch block will be immediately activated. So we should either eliminate the empty string test, or move it before the Integer creation. It's highly likely that a simple empty string test is faster than the Integer creation procedure, so we should move the test to before the Integer creation. (Note that the test was, in any case, incorrect in using the identity operator != rather than the equals() method, and I'll correct that here.)

Moving on, the "greater than 10" test (8) is necessary, and we've already eliminated the "greater than or equal to 2" test (9). The "less than or equal to 100000" (10) is necessary, and the test for the first digit being 3 (11) is necessary.

If the first digit is 3, however, the smallest possible valid number is 30 (since the number must be greater than 10); thus the "greater than 10" test should become a "greater than 29" test. Similarly, the largest number possible up to 100000, that starts with a 3, is 39999, so the "less than or equal to 100000" test should become "less than 40000."

Finally, the test for the first digit being 3 is surely simpler to execute than creating the Integer. To create the Integer, the minimum that would need to be executed would be to check that every character is a digit. So it makes sense to move the "first digit being 3" test to before the creation of the Integer.