10 January 2012

Levels in Renderscript

For ICS, Renderscript (RS) has been updated with several new features to simplify adding compute acceleration to your application. RS is interesting for compute acceleration when you have large buffers of data on which you need to do significant processing. In this example we will look at applying a levels/saturation operation on a bitmap.

In this case, saturation is implemented by multiplying every pixel by a color matrix. Levels are typically implemented with several operations.
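To make the two steps above concrete, here is a minimal Java sketch of a per-pixel saturation and levels pass. It is not the sample application's code: the class and method names, the Rec. 601 luma weights used to build the saturation matrix, and the particular order of the levels steps (input range, gamma, output range, clamp) are all assumptions made for illustration. It also assumes inWhite differs from inBlack.

```java
// Hypothetical sketch; names and constants are illustrative assumptions.
final class LevelsSketch {

    /** Builds a row-major 3x3 saturation matrix (s = 1 is identity, s = 0 is grayscale). */
    static float[] saturationMatrix(float s) {
        final float wr = 0.299f, wg = 0.587f, wb = 0.114f; // assumed Rec. 601 luma weights
        float a = 1f - s;
        return new float[] {
            a * wr + s, a * wg,     a * wb,
            a * wr,     a * wg + s, a * wb,
            a * wr,     a * wg,     a * wb + s
        };
    }

    static float clamp(float v, float lo, float hi) {
        return Math.max(lo, Math.min(hi, v));
    }

    /** Levels for one channel: input range, optional gamma, output range, clamp to 0..255. */
    static float levels(float v, float inBlack, float inWhite, float gamma,
                        float outBlack, float outWhite) {
        v = clamp((v - inBlack) / (inWhite - inBlack), 0f, 1f);
        if (gamma != 1f) {
            v = (float) Math.pow(v, gamma); // the costly part of the kernel
        }
        return clamp(v * (outWhite - outBlack) + outBlack, 0f, 255f);
    }

    /** Processes packed ARGB pixels (alpha in the top byte), preserving alpha. */
    static int[] apply(int[] in, float[] m, float inBlack, float inWhite, float gamma,
                       float outBlack, float outWhite) {
        int[] out = new int[in.length];
        for (int i = 0; i < in.length; i++) {
            int p = in[i];
            float r = (p >> 16) & 0xff;
            float g = (p >> 8) & 0xff;
            float b = p & 0xff;

            // Saturation: multiply the RGB vector by the color matrix.
            float tr = m[0] * r + m[1] * g + m[2] * b;
            float tg = m[3] * r + m[4] * g + m[5] * b;
            float tb = m[6] * r + m[7] * g + m[8] * b;

            // Levels: applied independently to each channel.
            int ri = Math.round(levels(tr, inBlack, inWhite, gamma, outBlack, outWhite));
            int gi = Math.round(levels(tg, inBlack, inWhite, gamma, outBlack, outWhite));
            int bi = Math.round(levels(tb, inBlack, inWhite, gamma, outBlack, outWhite));

            out[i] = (p & 0xff000000) | (ri << 16) | (gi << 8) | bi;
        }
        return out;
    }

    public static void main(String[] args) {
        int[] out = apply(new int[] {0xFF336699}, saturationMatrix(0.5f),
                          0f, 255f, 1f, 0f, 255f);
        System.out.printf("%08x%n", out[0]);
    }
}
```

The loop touches every pixel once and does a handful of floating-point operations per channel, which is the kind of workload the rest of this post is concerned with speeding up.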

The full code of the application is around 232 lines when you include code to compute the constants for the filter kernel, manage the controls, and display the image. On the devices I have lying around, processing an 800x423 image takes about 140-180ms.

What if that is not fast enough?

Porting the kernel of this image processing to RS (available at android-renderscript-samples) is quite simple. The pixel processing kernel above, reimplemented for RS, looks like:

The first line takes the script, processes the input allocation, and places the result in the output allocation. It does this by calling the natively compiled version of the script above once for each pixel in the allocation. However, unlike the Dalvik implementation, the primitives will automatically launch extra threads to do the work. This, combined with the performance of native code, can produce large performance gains. I’ll show the results with and without the gamma function enabled because it adds a lot of cost.
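The "once per pixel, spread across threads" behaviour described above can be pictured with a plain-Java analogy. This is only a sketch of the idea, not how RS is implemented or invoked: the class ParallelForEachSketch and its forEach method are hypothetical, and a parallel stream stands in for the threads RS launches on your behalf.

```java
import java.util.function.IntUnaryOperator;
import java.util.stream.IntStream;

// Hypothetical analogy; not the Renderscript API.
final class ParallelForEachSketch {

    /**
     * Calls the kernel once for each element of the input buffer and stores
     * the result at the same index of the output buffer. The index range is
     * split across worker threads by the parallel stream.
     */
    static void forEach(IntUnaryOperator kernel, int[] in, int[] out) {
        if (in.length != out.length) {
            throw new IllegalArgumentException("buffers must have the same length");
        }
        // Each index is written by exactly one task, so no locking is needed.
        IntStream.range(0, in.length)
                 .parallel()
                 .forEach(i -> out[i] = kernel.applyAsInt(in[i]));
    }

    public static void main(String[] args) {
        int[] in = {0xFF000000, 0xFF123456};
        int[] out = new int[in.length];
        // Example kernel: invert RGB, keep alpha.
        forEach(p -> (p & 0xff000000) | (~p & 0x00ffffff), in, out);
        System.out.printf("%08x %08x%n", out[0], out[1]);
    }
}
```

The kernel itself knows nothing about threads or buffer sizes; it only maps one input element to one output element, which is what lets the runtime decide how to divide the work.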

800x423 image

| Device | Dalvik | RS | Gain |
|---|---|---|---|
| Xoom | 174ms | 39ms | 4.5x |
| Galaxy Nexus | 139ms | 30ms | 4.6x |
| Tegra 30 device | 136ms | 19ms | 7.2x |

800x423 image with gamma correction

| Device | Dalvik | RS | Gain |
|---|---|---|---|
| Xoom | 994ms | 259ms | 3.8x |
| Galaxy Nexus | 787ms | 213ms | 3.7x |
| Tegra 30 device | 783ms | 104ms | 7.5x |

These gains are a large return on the simple coding investment shown above.