GroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupGroupSkip to Content

Image Processing in iOS Part 1: Raw Bitmap Modification

Version

Other, Other, Other

Imagine you just took the best selfie of your life. It’s spectacular, it’s magnificent and worthy your upcoming feature in Wired. You’re going to get thousands of likes, up-votes, karma and re-tweets, because you’re absolutely fabulous. Now if only you could do something to this photo to shoot it through the stratosphere…

That’s what image processing is all about! With image processing, you can apply fancy effects to photos such as modifying colors, blending other images on top, and much more.

In this two-part tutorial series, you’re first going to get a basic understanding of image processing. Then, you’ll make a simple app that implements a “spooky image filter” and makes use of four popular image processing methods:

Raw bitmap modification

Using the Core Graphics Library

Using the Core Image Library

Using the 3rd-party GPUImage library

In this first segment of this image processing tutorial, you’ll focus on raw bitmap modification. Once you understand this basic process, you’ll be able to understand what happens with other frameworks. In the second part of the series, you’ll learn about three other methods to make your selfie, and other images, look remarkable.

This tutorial assumes you have basic knowledge of iOS and Objective-C, but you don’t need any previous image processing knowledge.

Getting Started

Before you start coding, it’s important to understand several concepts that relate to image processing. So, sit back, relax and soak up this brief and painless discussion about the inner workings of images.

First things first, meet your new friend who will join you through tutorial…………drumroll……Ghosty!

Boo!

Now, don’t be afraid, Ghosty isn’t a real ghost. In fact, he’s an image. When you break him down, he’s really just a bunch of ones and zeroes. That’s far less frightening than working with an undead subject.

What’s an Image?

An image is a collection of pixels, and each one is assigned a single, specific color. Images are usually arranged as arrays and you can picture them as 2-dimensional arrays.

Here is a much smaller version of Ghosty, enlarged:

The little “squares” in the image are pixels, and each one shows only one color. When hundreds and thousands of pixels come together, they create a digital image.

How are Colors Represented in Bytes?

There are numerous ways to represent a color. The method that you’re going to use in this tutorial is probably the easiest to grasp: 32-bit RGBA.

As the name entails, 32-bit RGBA stores a color as 32 bits, or 4 bytes. Each byte stores a component, or channel. The four channels are:

R for red

G for green

B for blue

A for alpha.

As you probably already know, red, green and blue are a set of primary colors for digital formats. You can create almost any color you want from mixing them the right way.

Since you’re using 8-bits for each channel, the total amount of opaque colors you can actually create by using different RGB values in 32-bit RGBA is 256 * 256 * 256, which is approximately 17 million colors. Whoa man, that’s a lot of color!

The alpha channel is quite different from the others. You can think of it as transparency, just like the alpha property of UIView.

The alpha of a color doesn’t really mean anything unless there’s a color behind it; its main job is to tell the graphics processor how transparent the pixel is, and thus, how much of the color beneath it should show through.

You’ll get to dive into depth when you work through the section on blending.

To conclude this section, an image is a collection of pixels, and each pixel is encoded to display a single color. For this lesson, you’ll work with 32-bit RGBA.

Note: Have you ever wondered where the term Bitmap originated? A bitmap is a 2D map of pixels, each one comprised of bits! It’s literally a map of bits. Ah-ha!

So, now you know the basics of representing colors in bytes. There are still a three more concepts to cover before you dig in and start coding.

Color Spaces

The RGB method to represent colors is an example of a colorspace. It’s one of many methods that stores colors. Another colorspace is grayscale.

As the name entails, all images in the grayscale colorspace are black and white, and you only need to save one value to describe its color.

The downside of RGB is that it’s not very intuitive for humans to visualize.

Red: 0 Green:104 Blue:55

For example, what color do you think an RGB of [0, 104, 55] produces?

Taking an educated guess, you might say a teal or skyblue-ish color, which is completely wrong. Turns out it’s the dark green you see on this website!

Two other more popular color spaces are HSV and YUV.

HSV, which stand for Hue, Saturation and Value, is a much more intuitive way to describe colors. You can think of the parts this way:

Hue as “Color”

Saturation as “How full is this color”

Value, as the “Brightness”

In this color space, if you found yourself looking at unknown HSV values, it’s much easier to imagine what the color looks like based on the values.

The difference between RGB and HSV are pretty easy to understand, at least once you look at this image:

Image modified from work by SharkD under the Creative Commons Attribution-Share Alike 3.0 Unported license

YUV is another popular color space, because it’s what TVs use.

Television signals came into the world with one channel, Grayscale. Later, two more channels “came into the picture” when color film emerged. Since you’re not going to tinker with YUV in this tutorial, you might want to do some more research on YUV and other color spaces to round out your knowledge. :]

Note: For the same color space, you can still have different representations for colors. One example is 16-bit RGB, which optimizes memory use by using 5 bits for R, 6 bits for G, and 5 bits for B.

Why 6 for green, and 5 for red and blue? This is an interesting question and the answer comes from your eyeball. Human eyes are most sensitive to green and so an extra bit enables us to move more finely between different shades of green.

Coordinate Systems

Since an image is a 2D map of pixels, the origin needs specification. Usually it’s the top-left corner of the image, with the y-axis pointing downwards, or at the bottom left, with the y-axis pointing upwards.

There’s no “correct” coordinate system, and Apple uses both in different places.

Currently, UIImage and UIView use the top-left corner as the origin and Core Image and Core Graphics use the bottom-left. This is important to remember so you know where to find the bug when Core Image returns an “upside down” image.

Image Compression

This is the last concept to discuss before coding! With raw images, each pixel is stored individually in memory.

If you do the math on an 8 megapixel image, it would take 8 * 10^6 pixels * 4 bytes/pixel = 32 Megabytes to store! Talk about a data hog!

This is where JPEG, PNG and other image formats come into play. These are compression formats for images.

When GPUs render images, they decompress images to their original size, which can take a lot of memory. If your app takes up too much memory, it could be terminated by the OS (which looks to the user like a crash). So be sure to test your app with large images!

I’m dying for some action

Looking at Pixels

Now that you have a basic understanding of the inner workings of images, you’re ready to dive into coding. Today you’re going to work through developing a selfie-revolutionizing app called SpookCam, the app that puts a little Ghosty in your selfie!

Currently the app is loading the tiny version of Ghosty from the bundle, converting it into a pixel buffer and printing out the brightness of each pixel to the log.

What’s the brightness? It’s simply the average of the red, green and blue components.

Pretty neat. Notice how the outer pixels have a brightness of 0, which means they should be black. However, since their alpha value is 0, they are actually transparent. To verify this, try setting imageView background color to red, then build and run gain.

Now take a quick glance through the code. You’ll notice ViewController.m uses UIImagePickerController to pick images from the album or to take pictures with the camera.

After it selects an image, it calls -setupWithImage:. In this case, it outputs the brightness of each pixel to the log. Locate logPixelsOfImage: inside of ViewController.m, and review the first part of the method:

Section 1: Convert the UIImage to a CGImage object, which is needed for the Core Graphics calls. Also, get the image’s width and height.

Section 2: For the 32-bit RGBA color space you’re working in, you hardcode the parameters bytesPerPixel and bitsPerComponent, then calculate bytesPerRow of the image. Finally, you allocate an array pixels to store the pixel data.

Section 3: Create an RGB CGColorSpace and a CGBitmapContext, passing in the pixels pointer as the buffer to store the pixel data this context holds. You’ll explore Core Graphics in more depth in a section below.

Section 4: Draw the input image into the context. This populates pixels with the pixel data of image in the format you specified when creating context.

Section 5: Cleanup colorSpace and context.

Note: When you display an image, the device’s GPU decodes the encoding to display it on the screen. To access the data locally, you need to obtain a copy of the pixels, just like you’re doing here.

At this point, pixels holds the raw pixel data of image. The next few lines iterate through pixels and print out the brightness:

Define some macros to simplify the task of working with 32-bit pixels. To access the red component, you mask out the first 8 bits. To access the others, you perform a bit-shift and then a mask.

Get a pointer of the first pixel and start 2 for loops to iterate through the pixels. This could also be done with a single for loop iterating from 0 to width * height, but it's easier to reason about an image that has two dimensions.

Get the color of the current pixel by dereferencing currentPixel and log the brightness of the pixel.

Increment currentPixel to move on to the next pixel. If you're rusty on pointer arithmetic, just remember this: Since currentPixel is a pointer to UInt32, when you add 1 to the pointer, it moves forward by 4 bytes (32-bits), to bring you to the next pixel.

Note: An alternative to the last method is to declare currentPixel as a pointer to an 8-bit type (ie char). This way each time you increment, you move to the next component of the image. By dereferencing it, you get the 8-bit value of that component.

At this point, the starter project is simply logging raw image data, but not modifying anything yet. That's your job for the rest of the tutorial!

SpookCam - Raw Bitmap Modification

Of the four methods explored in this series, you'll spend the most time on this one because it covers the "first principles" of image processing. Mastering this method will allow you to understand what all the other libraries do.

In this method, you'll loop through each pixel, as the starter kit already does, but this time assign new values to each pixel.

This advantage of this method is that it's easy to implement and understand; the disadvantage is that scaling to larger images and effects that are more complicated is less than elegant.

As you see in the starter app, the ImageProcessor class already exists. Hook it up to the main ViewController by replacing -setupWithImage: with the following code in ViewController.m:

Now take a look at ImageProcessor.m. As you can see, ImageProcessor is a singleton object that calls -processUsingPixels: on an input image, then returns the output through the ImageProcessorDelegate.

-processUsingPixels: is currently a copy of the code you looked at previously that gives you access to the pixels of inputImage. Notice the two extra macros A(x) and RGBAMake(r,g,b,a) that are defined to provide convenience.

Now build and run. Choose an image from your album (or take a photo) and you should see it appear in your view like this:

That looks way too relaxing, time to bring in Ghosty!

Before the return statement in processUsingPixels:, add the following code to get an CGImageRef of Ghosty:

Notice how you only loop through the number of pixels in Ghosty's image, and offset the input image by offsetPixelCountForInput. Remember that although you're reasoning about images as 2-D arrays, in memory they are actually 1-D arrays.

Next, fill in this code after the comment Do some processing here to do the actual blending:

You apply 50% alpha to Ghosty by multiplying the alpha of each pixel by 0.5. You then blend with the alpha blend formula previously discussed.

The clamping of each color to [0,255] is not required here, since the value will never go out of bounds. However, most algorithms require this clamping to prevent colors from overflowing and giving unexpected outputs.

To test this code, add this code to the bottom of processUsingPixels:, replacing the current return statement:

This creates a new UIImage from the context and returns it. You're going to ignore the potential memory leak here for now.

Build and run. You should see Ghosty floating in your image like, well, a ghost:

Good work so far, this app is going viral for sure!

Black and White

One last effect to go. Try implementing the black and white filter yourself. To do this, set each pixel's red, green and blue components to the average of the three channels in the original, just like how you printed out Ghosty's brightness in the beginning.

Write this code before the // Create a new UIImage comment you added in the previous step.

Where To Go From Here?

Congratulations! You just finished your first image-processing application. You can download a working version of the project at this point here.

That wasn't too hard, right? You can play around with the code inside the for loops to create your own effects, try to see if you can implement these ones:

Swap the red and blue channels of the image

Increase the brightness of the image by 10%

As a further challenge, try scaling Ghosty using only pixel-based methods. Here are the steps:

Create a new CGContext with the target size for Ghosty.

For each pixel in this new Context, calculate which pixel you should copy from in the original image.

For extra coolness, try interpolating between nearby pixels if your calculations for the original coordinate lands in-between pixels. If you interpolate between the four nearest pixels, you have just implemented Bilinear scaling all on your own! What a boss!

If you've completed the first project, you should have a pretty good grasp on the basic concepts of image processing. Now you can set out and explore simpler and faster ways to accomplish these same effects.

In the next part of the series, you replace -processUsingPixels: with three new functions that will perform the same task using different libraries. Definitely check it out!

In the meantime, if you have any questions or comments about the series so far, please join the forum discussion below!