OpenCV Tutorial: Real-time Object Detection Using MSER in iOS

Over the last few years, the average mobile phone performance has increased significantly. Be it for sheer CPU horsepower or RAM capacity, it is now easier to do computation-heavy tasks on mobile hardware. Although these mobile technologies are headed in the right direction, there is still a lot to be done on mobile platforms, especially with the advent of augmented reality, virtual reality, and artificial intelligence.

A major challenge in computer vision is to detect objects of interest in images. The human eye and brain do an exceptional job, and replicating this in machines is still a dream. Over recent decades, approaches have been developed to mimic this in machines, and they are getting better.

In this tutorial, we will explore an algorithm used for detecting blobs in images. We will also use the implementation of this algorithm from the open source library OpenCV to build a prototype iPhone application that uses the rear camera to acquire images and detect objects in them.

OpenCV Tutorial

OpenCV is an open source library that provides implementations of major computer vision and machine learning algorithms. If you want to implement an application to detect faces, playing cards on a poker table, or even a simple application for adding effects to an arbitrary image, then OpenCV is a great choice.

OpenCV is written in C/C++, and has wrapper libraries for all major platforms. This makes it especially easy to use within the iOS environment. To use it within an Objective-C iOS application, download the OpenCV iOS Framework from the official website. Please make sure that you are using version 2.4.11 of OpenCV for iOS (which this article assumes you are using), as the latest version, 3.0, has some compatibility-breaking changes in how the header files are organized. Detailed information on how to install it is documented on its website.

MSER

MSER, short for Maximally Stable Extremal Regions, is one of the many methods available for blob detection within images. In simple words, the algorithm identifies contiguous sets of pixels whose outer boundary pixel intensities are higher (by a given threshold) than the inner boundary pixel intensities. Such regions are said to be maximally stable if their area changes little as the intensity threshold is varied over a range.
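To make the notion of stability more concrete, here is a minimal standalone C++ sketch, not OpenCV's implementation. It assumes a dark region on a lighter background, grows the region around a seed pixel with a plain flood fill, and measures how much its area changes when the threshold moves by `delta` in either direction. The names `regionArea` and `instability` are made up for this illustration. Real MSER implementations track all regions at once with a far more efficient data structure.

```cpp
#include <cassert>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// Row-major grayscale image with intensities in 0..255.
using Image = std::vector<std::vector<int>>;

// Number of pixels in the 4-connected region that contains the seed and whose
// intensities are all <= threshold. Returns 0 if the seed itself is brighter.
int regionArea(const Image& img, int seedRow, int seedCol, int threshold) {
    assert(!img.empty() && !img[0].empty());
    const int rows = static_cast<int>(img.size());
    const int cols = static_cast<int>(img[0].size());
    if (img[seedRow][seedCol] > threshold) return 0;

    std::vector<std::vector<bool>> visited(rows, std::vector<bool>(cols, false));
    std::queue<std::pair<int, int>> frontier;
    frontier.push(std::make_pair(seedRow, seedCol));
    visited[seedRow][seedCol] = true;

    const int dRow[] = {1, -1, 0, 0};
    const int dCol[] = {0, 0, 1, -1};
    int area = 0;
    while (!frontier.empty()) {
        const std::pair<int, int> current = frontier.front();
        frontier.pop();
        ++area;
        for (int k = 0; k < 4; ++k) {
            const int r = current.first + dRow[k];
            const int c = current.second + dCol[k];
            if (r < 0 || r >= rows || c < 0 || c >= cols) continue;
            if (visited[r][c] || img[r][c] > threshold) continue;
            visited[r][c] = true;
            frontier.push(std::make_pair(r, c));
        }
    }
    return area;
}

// Relative area change of the region between (threshold - delta) and
// (threshold + delta). Values near 0 mean the region is stable there.
double instability(const Image& img, int seedRow, int seedCol,
                   int threshold, int delta) {
    const int area = regionArea(img, seedRow, seedCol, threshold);
    if (area == 0) return std::numeric_limits<double>::infinity();
    const int grown = regionArea(img, seedRow, seedCol, threshold + delta);
    const int shrunk = regionArea(img, seedRow, seedCol, threshold - delta);
    return static_cast<double>(grown - shrunk) / area;
}
```

For a dark blob on a bright background, `instability` stays at zero for every threshold between the two intensity levels and jumps once the threshold approaches the background intensity, which is the behaviour the term "maximally stable" refers to.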

Although a number of other blob detection algorithms exist, MSER was chosen here because it has a fairly light run-time complexity of O(n log(log(n))) where n is the total number of pixels on the image. The algorithm is also robust to blur and scale, which is advantageous when it comes to processing images acquired through real-time sources, such as the camera of a mobile phone.

For the purpose of this tutorial, we will design the application to detect the logo of Toptal. The symbol has sharp corners, and that may lead one to think about how effective corner detection algorithms may be in detecting Toptal’s logo. After all, such an algorithm is both simple to use and understand. Although corner based methods may have a high success rate when it comes to detecting objects that are distinctly separate from the background (such as black objects on white backgrounds), it would be difficult to achieve real-time detection of Toptal’s logo on real-world images, where the algorithm would be constantly detecting hundreds of corners.

Strategy

Each frame that the application acquires through the camera is first converted to grayscale. A grayscale image has only one channel, but the logo remains visible nonetheless. This makes the image easier for the algorithm to deal with and significantly reduces the amount of data it has to process, with little to no loss for this task.
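As a rough illustration of what the grayscale step does to the data, the following standalone C++ sketch converts three-channel pixels into single-channel intensities using the common luma weights 0.299 R + 0.587 G + 0.114 B. The `BgrPixel` type and the function names are assumptions made for this example. In the application itself, this work would be delegated to OpenCV rather than written by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical three-channel pixel in blue, green, red order.
struct BgrPixel {
    std::uint8_t b;
    std::uint8_t g;
    std::uint8_t r;
};

// Weighted sum of the channels, rounded to the nearest integer.
std::uint8_t toGray(const BgrPixel& p) {
    const double luma = 0.299 * p.r + 0.587 * p.g + 0.114 * p.b;
    return static_cast<std::uint8_t>(luma + 0.5);
}

// Converts a whole frame; the output holds one byte per pixel instead of three.
std::vector<std::uint8_t> toGrayImage(const std::vector<BgrPixel>& frame) {
    std::vector<std::uint8_t> gray;
    gray.reserve(frame.size());
    for (std::size_t i = 0; i < frame.size(); ++i) {
        gray.push_back(toGray(frame[i]));
    }
    return gray;
}
```

The output buffer is a third of the size of the input, which is where the reduction in data mentioned above comes from.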

Next, we will use OpenCV’s implementation of the algorithm to extract all MSERs. Each MSER will then be normalized by transforming its minimum bounding rectangle into a square. This step is important because the logo may be captured from different angles and distances, and normalization increases tolerance to perspective distortion.
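The normalization step can be sketched as follows. This is a simplified standalone C++ example that assumes an axis-aligned bounding rectangle and stretches it independently along x and y onto a square of a chosen side length. The `Point` type and `normalizeToSquare` are hypothetical names, and the project's actual code may use a different rectangle or transform.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Point {
    int x;
    int y;
};

// Maps every point of a region from its axis-aligned bounding rectangle onto
// a square whose coordinates run from 0 to side - 1 on both axes.
std::vector<Point> normalizeToSquare(const std::vector<Point>& region, int side) {
    assert(!region.empty() && side > 1);
    int minX = region[0].x, maxX = region[0].x;
    int minY = region[0].y, maxY = region[0].y;
    for (const Point& p : region) {
        minX = std::min(minX, p.x);
        maxX = std::max(maxX, p.x);
        minY = std::min(minY, p.y);
        maxY = std::max(maxY, p.y);
    }
    // Guard against degenerate (one pixel wide or tall) regions.
    const double width = std::max(1, maxX - minX);
    const double height = std::max(1, maxY - minY);
    const double scaleX = (side - 1) / width;
    const double scaleY = (side - 1) / height;

    std::vector<Point> normalized;
    normalized.reserve(region.size());
    for (const Point& p : region) {
        Point q;
        q.x = static_cast<int>(std::lround((p.x - minX) * scaleX));
        q.y = static_cast<int>(std::lround((p.y - minY) * scaleY));
        normalized.push_back(q);
    }
    return normalized;
}
```

Because x and y are scaled separately, a logo that appears squashed in one direction is stretched back toward the same square shape as one seen head-on.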

Furthermore, a number of properties are computed for each MSER:

Number of holes

Ratio of the area of MSER to the area of its convex hull

Ratio of the area of MSER to the area of its minimum-area rectangle

Ratio of the length of MSER skeleton to area of the MSER

Ratio of the area of MSER to the area of its biggest contour
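To give a feel for how one of these properties could be computed, here is a standalone C++ sketch of the ratio between a region's area and the area of its convex hull. It treats every pixel as a unit square, builds the hull of all pixel corners with Andrew's monotone chain algorithm, and measures the hull with the shoelace formula. All names are hypothetical and the pixels are assumed to be distinct. With OpenCV you would normally rely on its own hull and area routines instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

struct GridPoint {
    int x;
    int y;
};

static long long cross(const GridPoint& o, const GridPoint& a, const GridPoint& b) {
    return static_cast<long long>(a.x - o.x) * (b.y - o.y) -
           static_cast<long long>(a.y - o.y) * (b.x - o.x);
}

// Andrew's monotone chain; returns hull vertices without collinear points.
std::vector<GridPoint> convexHull(std::vector<GridPoint> pts) {
    std::sort(pts.begin(), pts.end(), [](const GridPoint& a, const GridPoint& b) {
        return a.x < b.x || (a.x == b.x && a.y < b.y);
    });
    pts.erase(std::unique(pts.begin(), pts.end(),
                          [](const GridPoint& a, const GridPoint& b) {
                              return a.x == b.x && a.y == b.y;
                          }),
              pts.end());
    if (pts.size() < 3) return pts;

    std::vector<GridPoint> hull(2 * pts.size());
    std::size_t k = 0;
    for (std::size_t i = 0; i < pts.size(); ++i) {  // lower hull
        while (k >= 2 && cross(hull[k - 2], hull[k - 1], pts[i]) <= 0) --k;
        hull[k++] = pts[i];
    }
    for (std::size_t i = pts.size() - 1, t = k + 1; i > 0; --i) {  // upper hull
        while (k >= t && cross(hull[k - 2], hull[k - 1], pts[i - 1]) <= 0) --k;
        hull[k++] = pts[i - 1];
    }
    hull.resize(k - 1);  // the last point repeats the first one
    return hull;
}

// Shoelace formula for the area of a simple polygon.
double polygonArea(const std::vector<GridPoint>& poly) {
    long long twice = 0;
    for (std::size_t i = 0; i < poly.size(); ++i) {
        const GridPoint& a = poly[i];
        const GridPoint& b = poly[(i + 1) % poly.size()];
        twice += static_cast<long long>(a.x) * b.y - static_cast<long long>(b.x) * a.y;
    }
    return std::llabs(twice) / 2.0;
}

// Pixel count of the region divided by the area of the hull of its pixels.
double hullAreaRatio(const std::vector<GridPoint>& pixels) {
    assert(!pixels.empty());
    std::vector<GridPoint> corners;
    corners.reserve(pixels.size() * 4);
    for (const GridPoint& p : pixels) {
        for (int dy = 0; dy <= 1; ++dy) {
            for (int dx = 0; dx <= 1; ++dx) {
                GridPoint corner = {p.x + dx, p.y + dy};
                corners.push_back(corner);
            }
        }
    }
    return pixels.size() / polygonArea(convexHull(corners));
}
```

A filled square yields a ratio of 1, while shapes with concavities score lower, which makes this value a cheap descriptor of how "solid" a region is.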

In order to detect Toptal’s logo in an image, the properties of all the MSERs are compared to the already learned properties of the Toptal logo. For the purpose of this tutorial, the maximum allowed difference for each property was chosen empirically.

Finally, the most similar region is chosen as the result.
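The comparison and selection described above could look roughly like the following standalone C++ sketch. The struct layout, the rule that the hole count must match exactly, and the use of a summed absolute difference as the score are all assumptions made for illustration. The project's real classes and scoring may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container for the properties listed above.
struct MserFeatures {
    int holes;
    double convexHullRatio;
    double minRectRatio;
    double skeletonRatio;
    double contourRatio;
};

// Maximum allowed difference per property (chosen empirically in the article).
struct Tolerances {
    double convexHullRatio;
    double minRectRatio;
    double skeletonRatio;
    double contourRatio;
};

// Returns the summed absolute difference to the template, or a negative value
// if the candidate is rejected because some property is out of tolerance.
double distanceToTemplate(const MserFeatures& candidate,
                          const MserFeatures& tmpl,
                          const Tolerances& tol) {
    if (candidate.holes != tmpl.holes) return -1.0;  // assumption: exact match
    const double dHull = std::fabs(candidate.convexHullRatio - tmpl.convexHullRatio);
    const double dRect = std::fabs(candidate.minRectRatio - tmpl.minRectRatio);
    const double dSkel = std::fabs(candidate.skeletonRatio - tmpl.skeletonRatio);
    const double dCont = std::fabs(candidate.contourRatio - tmpl.contourRatio);
    if (dHull > tol.convexHullRatio || dRect > tol.minRectRatio ||
        dSkel > tol.skeletonRatio || dCont > tol.contourRatio) {
        return -1.0;
    }
    return dHull + dRect + dSkel + dCont;
}

// Index of the most similar accepted candidate, or -1 if none is accepted.
int bestMatch(const std::vector<MserFeatures>& candidates,
              const MserFeatures& tmpl,
              const Tolerances& tol) {
    int best = -1;
    double bestDistance = 0.0;
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        const double d = distanceToTemplate(candidates[i], tmpl, tol);
        if (d < 0.0) continue;
        if (best < 0 || d < bestDistance) {
            best = static_cast<int>(i);
            bestDistance = d;
        }
    }
    return best;
}
```

Rejecting candidates outright before scoring keeps the per-frame cost low, since most regions in a real-world image fail at least one tolerance check.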

iOS Application

Using OpenCV from iOS is easy. If you haven’t done it yet, here is a quick outline of the steps involved in setting up Xcode to create an iOS application and use OpenCV in it:

Create a new project named “SuperCool Logo Detector.” As the language, leave Objective-C selected.

Add a new Prefix Header (.pch) file and name it PrefixHeader.pch

Go into project “SuperCool Logo Detector” Build Target and in the Build Settings tab, find the “Prefix Headers” setting. You can find it in the LLVM Language section, or use the search feature.

Add “PrefixHeader.pch” to Prefix Headers setting

At this point, if you haven’t installed OpenCV for iOS 2.4.11, do it now.

Drag-and-drop the downloaded framework into the project. Check “Linked Frameworks and Libraries” in your Target Settings. (It should be added automatically, but better to be safe.)

Additionally, link the following frameworks:

AVFoundation

AssetsLibrary

CoreMedia

Open “PrefixHeader.pch” and add the following 3 lines:

#ifdef __cplusplus
#include <opencv2/opencv.hpp>
#endif

Change extensions of automatically created code files from “.m” to “.mm”. OpenCV is written in C++ and with *.mm you are saying that you will be using Objective-C++.

Import “opencv2/highgui/cap_ios.h” in ViewController.h and change ViewController to conform with the protocol CvVideoCameraDelegate:

#import <opencv2/highgui/cap_ios.h>

Open Main.storyboard and put a UIImageView on the initial view controller.

Make an outlet to ViewController.mm named “imageView”

Create a variable “CvVideoCamera *camera;” in ViewController.h or ViewController.mm, and initialize it with a reference to the rear-camera:

If you build the project now, Xcode will warn you that you didn’t implement the “processImage” method from CvVideoCameraDelegate. For now, and for the sake of simplicity, we will just acquire the images from the camera and overlay them with a simple text:

Add a single line to “viewDidAppear”:

[camera start];

Now, if you run the application, it will ask you for permission to access the camera. You should then see the video from the camera.

That is pretty much it. Now you have a very simple application that draws the text “Toptal” on images from the camera. We can now build our target logo-detecting application on top of this simpler one. For brevity, in this article we will discuss only a handful of code segments that are critical to understanding how the application works overall. The code on GitHub has a fair amount of comments to explain what each segment does.

Since the application has only one purpose, to detect Toptal’s logo, as soon as it is launched, MSER features are extracted from the given template image and the values are stored in memory:

The application has only one screen with a Start/Stop button, and all necessary information, such as the FPS and the number of detected MSERs, is drawn automatically on the image. As long as the application is not stopped, the following processImage method is invoked for every image frame from the camera:

This method, in essence, creates a grayscale copy of the original image. It identifies all MSERs and extracts their relevant features, scores each MSER for similarity with the template and picks the best one. Finally, it draws a green boundary around the best MSER and overlays the image with meta information.

Below are the definitions of a few important classes, and their methods, in this application. Their purposes are described within comments.

Conclusion

In this article we have shown how easy it is to detect simple objects in an image using OpenCV. The entire code is available on GitHub. Feel free to fork and send pull requests, as contributions are welcome.

As is true for any machine learning problem, the success rate of the logo detection in this application may be increased by using a different set of features and a different method for object classification. However, I hope that this article will help you get started with object detection using MSER and with applications of computer vision techniques in general.

About the author

Altaibayar is a full-stack developer with two years of professional experience, but his talents don't stop there. Beginning with J2ME and Windows Phone and moving on to Android and iOS, his hobby since high school has been developing independently for mobile platforms. He is an enthusiastic engineer and a responsible team member.

Comments

meadlai

hi, I have builded your app, it works nice, thanks.
but there are some questions, where is the image named "toptal logo", I can't locate it.
[[MLManager sharedInstance] learn: [UIImage imageNamed: @"toptal logo"]];

Waqas Khalid Obeidy

I think toptal logo is located in Images.xcassets folder.

Haresh Kainth

[SOLUTION FOUND] - Don't set the target to 9.0 in your Xcode project. Set the target to 7.1.
Hi, Ive downloaded your project and was able to build and install it on my iPhone. But it keeps crashing. Following is the last error in the debugger:
OpenCV Error: Assertion failed (points.checkVector(2) >= 0 && (points.depth() == CV_32F || points.depth() == CV_32S)) in minAreaRect, file /Users/build/test/opencv/modules/imgproc/src/contours.cpp, line 1913
libc++abi.dylib: terminating with uncaught exception of type cv::Exception: /Users/build/test/opencv/modules/imgproc/src/contours.cpp:1913: error: (-215) points.checkVector(2) >= 0 && (points.depth() == CV_32F || points.depth() == CV_32S) in function minAreaRect
Im not quite sure why this is happening? Any ideas?

Rashmi Ranjan

This post describes how to recognize the hard-coded toptal logo. How to make it generic so that it can detect other faces also? Is there anyway? I am finding it tough.

M.Lo

hello, it can not work to my phone, Xcode show 'linker command failed with exit code 1 (use -v to see invocation)' how to fix this problem? thanks

Fabio Panc

Hi! really interesting post. I am trying to develop an Android app that detects all the boxes on a shelf.
Do you think that an approach like yours is good? for now using line detection or contours detection fails miserbly because of the content's of the boxes's covers (like colorful logos, etc) that f*cks up Canny phase (I use median*0.66 and median*1.33 as thresholds for canny, so having "disturbing" logos and text on each box avoid a correct detection). I'm thinking of collect a list of images representing my boxes (different types, like milk bottle, like cereals box, etc etc) and, using one of the object detection algorithms provided by openCV , find my boxes.

shrey shrivastava

I'm getting error
~/MSERManager.mm:30:5: No type named 'MserFeatureDetector' in namespace 'cv' on line cv::MserFeatureDetector mserDetector;
Does anyone else getting this error?

shrey shrivastava

Just FYI,
#import <opencv2/highgui/cap_ios.h> has been changed to
#import <opencv2/videoio/cap_ios.h> in the latest version of OpenCV

Vamshi Krishna

"In order to detect Toptal’s logo in an image, properties of the all the MSERs are compared to already learned Toptal logo properties". In which class are Toptal logo properties defined?

kiran kumar G

This post describes how to recognize the hard-coded toptal logo. How to make it generic so that it can detect other faces also? Is there anyway to use our images and use this project
