There are multiple reasons why I'm asking this question, and I realize that it doesn't have an easy answer - accurately benchmarking code is difficult. I’m hoping I can craft this post in such a way that it can be helpful to everyone now and in the future and not be considered too broad.

With new releases coming out from Salesforce three times a year, what was once a CPU-intensive method may be improved in a future version and, conversely, new operations could be introduced that blow your limits out of the water. So please note that any claims about which Apex operations are CPU-intensive should be re-confirmed regularly.

My goals from this question are as follows:

While there are many questions out there around CPU timeout exceptions, I don't see anything asking about what to specifically avoid doing before a problem actually happens. (Maybe I didn't search hard enough?)

I'm in the process of writing performance rules into the PMD engine and ApexMetrics projects (see pmd and codeclimate-apexmetrics on GitHub), and I'd like to gather feedback on what I should include as a rule. It would be great if some others were able to do their own benchmarking and give me ideas for what I should include.

Dan Appleman and I presented this topic at DF16 and gave some guidance on how to do your own testing. I’d like to share that knowledge here (in case you missed the session), and I'm interested to know if there have been any additional findings that I'm not already aware of.

1 Answer

Benchmarking Code

With a small amount of code you can do some testing to figure out what is eating up that precious CPU limit. (Check out How does SF calculate the CPU time? for reference on what Salesforce counts towards this limit.)
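As a starting point, here is a minimal sketch of the basic measurement technique, run as anonymous Apex. `Limits.getCpuTime()` is the system method Salesforce provides for this; the loop body is just a placeholder for whatever operation you want to measure.

```apex
// Minimal CPU-time sampling sketch (illustrative only).
Integer iterations = 1000000;

Integer startCpu = Limits.getCpuTime();
for (Integer i = 0; i < iterations; i++) {
    Integer x = i; // operation under test goes here
}
Integer elapsed = Limits.getCpuTime() - startCpu;

System.debug('CPU ms for ' + iterations + ' iterations: ' + elapsed);
```

Note that `Limits.getCpuTime()` only reports whole milliseconds, which is why a large iteration count and averaging across runs (as described below) matter.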

To help get started, I would recommend installing the LimitsProfiler package that Adrian Larson has developed. By using the code provided in this unmanaged package, you no longer need to deal with setting the appropriate logging levels when debugging. In fact, you don’t need to use debug logs at all. (Note that this unmanaged package contains several classes, pages, a custom object, and a custom settings object, and should never be installed into a production environment.)

After installing the package and saving your profiler class, navigate to /apex/LimitsProfiler and click the "Configure" button to configure a new test run. The Profiler Type should be set to LimitsTesting.AssignmentProfiler and Iterations should be set to 1,000,000. Only the Display CPU Time? checkbox needs to be checked. Saving the configuration will take you back to the main page.

From the main page, take five or more measurements and then calculate the average CPU time across the tests. Record the calculated average, and if you wish, save your results. (Saving the results stores them in a custom object that you can reference in a report later, if you need them.)

You will then perform three more tests, following the directions in the comments of the profiler class. (For the fourth test, you will likely need to drop the number of iterations down to 100,000 or 10,000.) When calculating the CPU time for tests #2-4, remember to subtract the average CPU time from the first test. We subtract this number because we want to know the CPU time for a single variable assignment, not the CPU time for a variable assignment plus the overhead of the enclosing for loop.

Note: We should only subtract 93.9ms from the average in test #4 since we are only performing 100,000 instead of 1,000,000 iterations.
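To make the baseline subtraction concrete, here is the arithmetic as an Apex sketch. The numbers are hypothetical (the 93.9ms figure above implies a baseline of roughly 939ms per 1,000,000 iterations; your measurements will differ):

```apex
// Illustrative numbers only; substitute your own measured averages.
Decimal baselineAvg = 939;          // avg CPU ms from test #1 (1,000,000 iterations)
Decimal test4Avg = 450;             // hypothetical avg CPU ms from test #4
Decimal test4Iterations = 100000;   // test #4 ran fewer iterations

// Scale the baseline to test #4's iteration count: 939 * 0.1 = 93.9 ms
Decimal scaledBaseline = baselineAvg * (test4Iterations / 1000000);

// CPU time attributable to the operation itself
Decimal netCpu = test4Avg - scaledBaseline;
```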

What if I don't want to install an unmanaged package to benchmark CPU time?

More details around benchmarking Apex can be found in the DF16 presentation: The Dark Art of CPU Benchmarking and examples found on my DF16 Github repo. The BenchmarkTests.cls class can be used as a guide if you wish to perform CPU benchmarking without using the LimitsProfiler. One of the most important things to remember when benchmarking CPU time is to make sure that all of your logging levels are turned off except where absolutely necessary.
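In the spirit of BenchmarkTests.cls, a package-free benchmark can be sketched as two timed loops in anonymous Apex: an empty loop to measure the baseline overhead, and a second loop containing the operation under test. (The operation shown here is just an example.)

```apex
// Package-free benchmarking sketch: measure baseline, then operation.
Integer iterations = 1000000;

Integer t0 = Limits.getCpuTime();
for (Integer i = 0; i < iterations; i++) { }      // empty loop: baseline
Integer baseline = Limits.getCpuTime() - t0;

Integer t1 = Limits.getCpuTime();
for (Integer i = 0; i < iterations; i++) {
    String s = String.valueOf(i);                 // operation under test
}
Integer total = Limits.getCpuTime() - t1;

System.debug('Net CPU ms: ' + (total - baseline));
```

Remember to run this with all debug log levels turned off (or set to NONE), since active logging itself inflates the measured CPU time.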

What are the most CPU intensive operations?

We learned a lot while preparing for our DF16 talk on benchmarking. Here are some highlights of what we discovered:

Calling Schema.getGlobalDescribe() is a big one. While Salesforce does some caching of describe information internally, that caching is not very effective: Schema.getGlobalDescribe() eats up precious milliseconds of CPU time even on subsequent calls within the same transaction.
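One common mitigation is to call Schema.getGlobalDescribe() at most once per transaction and cache the result yourself in a static variable. A sketch (the class and method names here are illustrative, not part of any standard library):

```apex
// Sketch: pay for Schema.getGlobalDescribe() at most once per transaction.
public class DescribeCache {
    private static Map<String, Schema.SObjectType> globalDescribe;

    public static Schema.SObjectType getType(String objectName) {
        if (globalDescribe == null) {
            globalDescribe = Schema.getGlobalDescribe();
        }
        return globalDescribe.get(objectName);
    }
}
```

If you already know the object names at compile time, referencing the token directly (e.g. `Account.SObjectType`) avoids the global describe entirely and is cheaper still.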

Using doubles instead of decimals is 200 times faster (half a microsecond vs 100 microseconds) - though make sure you understand the difference when choosing one or the other (Decimal or double?)
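The difference is easy to see with two otherwise identical loops; per the figures above, the Double loop should consume far less CPU time than the Decimal one. (A sketch for your own benchmarking, not a guaranteed result.)

```apex
// Sketch: identical arithmetic with Double vs Decimal.
Integer iterations = 1000000;

Integer t0 = Limits.getCpuTime();
Double d = 0;
for (Integer i = 0; i < iterations; i++) { d = d + 1.5; }
Integer doubleMs = Limits.getCpuTime() - t0;

Integer t1 = Limits.getCpuTime();
Decimal dec = 0;
for (Integer i = 0; i < iterations; i++) { dec = dec + 1.5; }
Integer decimalMs = Limits.getCpuTime() - t1;

System.debug('Double: ' + doubleMs + 'ms, Decimal: ' + decimalMs + 'ms');
```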

Serializing data can be CPU intensive depending on the amount of data being serialized.
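To get a feel for serialization cost on your own org, you can time JSON.serialize() against payloads of increasing size, for example:

```apex
// Sketch: serialization cost grows with payload size (no DML performed).
List<Account> accts = new List<Account>();
for (Integer i = 0; i < 10000; i++) {
    accts.add(new Account(Name = 'Acct ' + i));
}

Integer t0 = Limits.getCpuTime();
String payload = JSON.serialize(accts);
System.debug('Serialize CPU ms: ' + (Limits.getCpuTime() - t0));
```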

Dynamic assignment uses 8+ microseconds of CPU time, whereas a simple assignment or static assignment can be done in far less than a microsecond. Again, this is really only going to be an issue if you’re iterating over a large list.
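For clarity, the two styles being compared look like this (timings per the figures above are approximate and worth re-measuring on your own org):

```apex
Account a = new Account();

// Static (compile-time) field assignment: well under a microsecond each
a.Name = 'Acme';

// Dynamic assignment via the generic SObject API: roughly 8+ microseconds each
a.put('Name', 'Acme');
```

The dynamic form is sometimes unavoidable (e.g. when field names are only known at runtime), but inside a loop over a large list the difference adds up.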

Good to see this question (and answer) up now! I believe that using the SObject constructor to set fields is even faster than instance.<fieldName> = value; (diminishing returns, though. See my and Adrian's answers on How to avoid instantiating objects in a loop). Also, I think this goes without saying, but if you have two lists of objects that share a common, non-lookup, field, iterating over one list to build a map is much better than a nested loop with if(obj1.field == obj2.field).
– Derek F, Apr 7 '17 at 14:01

I'm wondering if there are further factors at play there that will make accurately measuring CPU time very difficult. For one thing, Limits.getCpuTime() only measures to the millisecond. Certainly averaging many runs helps, but random errors, plus the multitenant environment and multiple application servers, could add all sorts of variance.
– Daniel Ballinger, Apr 13 '17 at 0:28

E.g. Every calculation is heavily dependent on the result of Test 1 as the base loop overhead. I tried repeating Test 1 three times and got Apex CPU measurements of 1176ms, 1093ms, 974ms. This variation would be compounded into all the subsequent results as it is more like (as a very rough example) 1000ms ± 150ms for the base loop.
– Daniel Ballinger, Apr 13 '17 at 0:32

The article Measurements and Error Analysis goes into more detail on factors like instrument resolution and environmental factors that will be impacting the precision of the results. I've also got doubts about how accurate Limits.getCpuTime() itself is with respect to measuring CPU time.
– Daniel Ballinger, Apr 13 '17 at 0:46

Great comments here! I haven't had time to read through all of the articles posted yet, but I'll be sure to do so. @DanielBallinger you made great points - I guess one thing I should clarify is that my results are certainly not 100% accurate since each test always yields different results. I would argue that it does at least allow us to draw some conclusions. In the static vs dynamic variable assignment example I have consistently seen dynamic assignment use considerably more CPU time than static assignment, so from that we can conclude that static assignment in a for loop is preferable.
– Robert Watson, Apr 13 '17 at 17:24