14 October 2010

Traceview War Story

I recently took my first serious look at Traceview, and it occurred to me, first, that there are probably a few other Android developers who haven’t used it and, second, that this is an opportunity to lecture sternly on one of my favorite subjects: performance improvement and profiling. This is perhaps a little bit Android-101; If you already know all about Traceview, you can stop here and go back to coding.

Making Apps Fast

Here’s a belief that I think I share with most experienced developers: For any app that is even moderately complex, you’re not smart enough to predict what the slow parts are going to be, because nobody is smart enough to predict where software bottlenecks will turn up.

So the smart way to write a fast app is to build it in the simplest way that could possibly work, avoiding egregiously-stupid thing like order-N-squared algorithms and doing I/O on the Android UI thread. Who knows, it might be fast enough, and then you’re done!

If it isn’t fast enough, don’t guess why. Measure it and find out, using a profiler. Actually I’ve been known to do this, when backed into a corner, using things like System.err.println("Entered at" + System.currentTimeMillis()); Fortunately, Android comes with a reasonably decent profiler, so you don’t have to get ugly like that.

Case Study: LifeSaver 2

I have this little utility in Android Market called LifeSaver 2, the details are on my personal blog. At one point, it reads the SMS and phone-call logs out of the system and persists them in a JSON text file on the SD card. Since this is kind of slow, it shows a nice dynamic progress bar. It occurred to me to wonder why it was kind of slow to write a few hundred records into a text file on a device that, after all, has a gigahertz processor.

Somebody who foolishly disregarded my advice above might assume that the slowdown had to be due to the ContentProvider Cursor machinery reading the system logs, or failing that, the overhead of writing to the SD card. A wiser person would instrument the code and find out. Let’s do that.

Turning On Tracing

I went into Saver.java and bracketed the code in its run() method like so:

The first call turns tracing on, the argument "lsd" (stands for Life Saver Debug, of course) tells the system to put the trace log in /sdcard/lsd.trace. Remember that doing this means you have to add the WRITE_EXTERNAL_STORAGE permission so you can save the trace info; don‘t forget to remove that before you ship.

[Update:] Android engineer Xavier Ducrohet writes to remind me: “DDMS has a start/stop profiling button in the ‘device view’. Upon clicking stop it launches TraceView with the trace file. This is not as fine grained as putting start/stopMethodTracing in your code but can be quite useful. For VMs earlier than froyo, the permission is required as well (DDMS basically automate getting the trace from the sd card and saving it locally before calling traceview). For Froyo+ VMs, the VM is able to send the trace file through the JDWP connection and the permission is not needed anymore (which is really useful).” Thanks, Xav!

Then you run your app, then you copy the output over to your computer, and fire up Traceview.

540> adb pull /sdcard/lsd.trace
541> traceview lsd

At this point, you will have noticed three things. First, turning tracing on really slows down your app. Second, the tracefile is big; in this case, 8.6M for a run that took like four seconds. Third, that traceview looks pretty cool.

The bars across the top show the app’s threads and how they dealt out the time; since the Nexus One is single-threaded CPU, they have to take turns. Let’s zero in on one 100-msec segment.

The top line is where my app code is running (the red segment is GC happening), the middle line is the UI thread and the bursts of activity are the ProgressBar updating, and I have no idea what the third thread, named HeapWorker, does, but it doesn’t seem a major contributor to the app’s runtime, so let’s ignore it.

The bottom of the screen is where the really interesting data is; it shows which of your methods burned the time, and can be sorted in a bunch of different ways. Let’s zero in on the first two lines.

Translated into English, this tells us that the top-level routine consumed 100% of the time if you include everything it called (well, yeah), but only 0.9% of the time itself. The next line suddenly starts to get real interesting: java.io.PrintStream.println(Object) and whatever it calls are using 65.2% of the app’s time. This is the code that writes the JSON out to the SD card. Right away, we know that apparently the task of pulling the data out of the phone’s ContentProviders doesn’t seem to be very expensive; it’s the output that’s hurting.

Can we conclude that the app is limited by the sluggish write performance of the SD card? Let’s drill down, which is done in the most obvious point-and-click way imaginable.

Ooh, there’s a nasty surprise. Of course, println calls (in effect) toString() on all its arguments. It looks like turning the arguments to strings is taking over half the time, before it even dispatches from println(Object) to println(String).

I’ll skip the step of drilling down into println(String) but it does suggest that yes, there is some slow I/O happening there, to the SD card. But let’s look inside that String.valueOf() call.

There’s your smoking pistol. It turns out that org.json.JSONObject.toString() is what we professional programmers call a, well, this is a family-friendly operation so I won’t go there. You can poke around inside it, but it’s just depressing.

What you can do, however, is sort all the routines by their “Exclusive” times, as in the number of CPU circles burned right there in the routine. Here are all of them that use 1% or more of the total execution time.

There’s a little bit of GC and Android framework View-wrangling stuff in there, but the display is dominated by org.jason and java.lang.StringBuilder code.

The Conclusion

The real conclusion is that in the case of this app, I actually don’t care about the performance. A user runs it a grand total of two times, once on the old phone and once on the new phone, and it’s got lots of eye candy, so I just don’t think there’s a problem.

If I did want to speed this up, it’s obvious what to do. First, either stop using JSON, or find a cheaper way to serialize it. Second, do fewer println() calls; glom the data together in one big buffer and just blast it out with a single I/O call. But, and here’s the key point, if I’d guessed where the bottlenecks were, I’d have been wrong, mostly.

Traceview is a nice tool, and if you don’t already know it, you owe it to yourself to learn it.