Homography Examples using OpenCV ( Python / C++ )

The Tower of Babel, according to a mythical tale in the Bible, was humans’ first engineering disaster. The project had all the great qualities of having a clear mission, lots of manpower, no time constraints and adequate technology ( bricks and mortar ). Yet it failed spectacularly because God confused the language of the human workers and they could no longer communicate.

Terms like “Homography” often remind me how we still struggle with communication. Homography is a simple concept with a weird name!

What is Homography ?

Consider two images of a plane (top of the book) shown in Figure 1. The red dot represents the same physical point in the two images. In computer vision jargon we call these corresponding points. Figure 1. shows four corresponding points in four different colors — red, green, yellow and orange. A Homography is a transformation ( a 3×3 matrix ) that maps the points in one image to the corresponding points in the other image.

Figure 1 : Two images of a 3D plane ( top of the book ) are related by a Homography

Now since a homography is a 3×3 matrix we can write it as

    H = [ h00  h01  h02 ]
        [ h10  h11  h12 ]
        [ h20  h21  h22 ]

Let us consider the first set of corresponding points — ( x, y ) in the first image and ( x', y' ) in the second image. Then, the Homography H maps them in the following way

    [ x' ]       [ x ]
    [ y' ] = H * [ y ]
    [ 1  ]       [ 1 ]
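In code, applying H to a point is just a matrix multiplication in homogeneous coordinates; the result is in general only defined up to a scale factor, so we divide by the third coordinate to get pixel coordinates back. A minimal numpy sketch (the matrix entries below are made-up values for illustration):

```python
import numpy as np

# An example homography; the entries are made-up values for illustration.
H = np.array([[1.0, 0.00, 5.0],
              [0.0, 1.00, 10.0],
              [0.0, 0.01, 1.0]])

def apply_homography(H, x, y):
    """Map the point (x, y) through H in homogeneous coordinates."""
    p = H @ np.array([x, y, 1.0])
    # Divide out the scale factor to return to pixel coordinates.
    return float(p[0] / p[2]), float(p[1] / p[2])

print(apply_homography(H, 10, 20))  # -> (12.5, 25.0)
```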

Image Alignment Using Homography

The above equation is true for ALL sets of corresponding points as long as they lie on the same plane in the real world. In other words you can apply the homography to the first image and the book in the first image will get aligned with the book in the second image! See Figure 2.

Figure 2 : One image of a 3D plane can be aligned with another image of the same plane using Homography

But what about points that are not on the plane ? Well, they will NOT be aligned by a homography as you can see in Figure 2. But wait, what if there are two planes in the image ? Well, then you have two homographies — one for each plane.

Panorama : An Application of Homography

In the previous section, we learned that if a homography between two images is known, we can warp one image onto the other. However, there was one big caveat. The images had to contain a plane ( the top of a book ), and only the planar part was aligned properly. It turns out that if you take a picture of any scene ( not just a plane ) and then take a second picture by rotating the camera, the two images are related by a homography! In other words you can mount your camera on a tripod and take a picture. Next, pan it about the vertical axis and take another picture. The two images you just took of a completely arbitrary 3D scene are related by a homography. The two images will share some common regions that can be aligned and stitched and bingo you have a panorama of two images. Is it really that easy ? Nope! (sorry to disappoint) A lot more goes into creating a good panorama, but the basic principle is to align using a homography and stitch intelligently so that you do not see the seams. Creating panoramas will definitely be part of a future post.

How to calculate a Homography ?

To calculate a homography between two images, you need to know at least 4 point correspondences between the two images. If you have more than 4 corresponding points, it is even better. OpenCV will robustly estimate a homography that best fits all corresponding points. Usually, these point correspondences are found automatically by matching features like SIFT or SURF between the images, but in this post we are simply going to click the points by hand.

Let’s look at the usage first.

C++

// pts_src and pts_dst are vectors of points in source
// and destination images. They are of type vector<Point2f>.
// We need at least 4 corresponding points.
Mat h = findHomography(pts_src, pts_dst);
// The calculated homography can be used to warp
// the source image to destination. im_src and im_dst are
// of type Mat. Size is the size (width,height) of im_dst.
warpPerspective(im_src, im_dst, h, size);

Python

'''
pts_src and pts_dst are numpy arrays of points
in source and destination images. We need at least
4 corresponding points.
'''
h, status = cv2.findHomography(pts_src, pts_dst)
'''
The calculated homography can be used to warp
the source image to destination. Size is the
size (width,height) of im_dst
'''
im_dst = cv2.warpPerspective(im_src, h, size)

Let us look at a more complete example.

OpenCV Homography Example

The images in Figure 2. can be generated with just a few lines of code: take four corresponding points in the two images, estimate the homography, and warp one image onto the other.

Applications of Homography

The most interesting application of Homography is undoubtedly making panoramas ( a.k.a image mosaicing and image stitching ). Panoramas will be the subject of a later post. Let us see some other interesting applications.

Perspective Correction using Homography

Figure 3. Perspective Correction

Let’s say you have the photo shown in Figure 1. Wouldn’t it be cool if you could click on the four corners of the book and quickly get an image that looks like the one shown in Figure 3? You can get the code for this example in the download section below. Here are the steps.

Write a user interface to collect four corners of the book. Let’s call these points pts_src

We need to know the aspect ratio of the book. For this book, the aspect ratio ( width / height ) is 3/4. So we can choose the output image size to be 300×400, and our destination points ( pts_dst ) to be (0,0), (299,0), (299,399) and (0,399)

Obtain the homography using pts_src and pts_dst .

Apply the homography to the source image to obtain the image in Figure 3.

You can download the code and images used in this post by subscribing to our newsletter here.

Virtual Billboard

In many televised sports events, advertisements are virtually inserted into the live video feed. For example, in soccer and baseball, the ads placed on the small advertisement boards right outside the boundary of the field can be virtually changed. Instead of displaying the same ad to everybody, advertisers can choose which ads to show based on the viewer’s demographics, location etc. In these applications the four corners of the advertisement board are detected in the video and serve as the destination points. The four corners of the ad serve as the source points. A homography is calculated from these four corresponding points, and it is used to warp the ad into the video frame.

After reading this post you probably have an idea of how to put an image on a virtual billboard. Figure 4. shows the first image uploaded to the internet.

Figure 4. First image uploaded to the internet.

And Figure 5. shows Times Square.

Figure 5. Times Square


We can replace one of the billboards in Times Square with the image of our choice. Here are the steps.

Write a user interface to collect the four corners of the billboard in the image. Let’s call these points pts_dst

Let the size of the image you want to put on the virtual billboard be w x h. The corners of the image ( pts_src ) are therefore (0,0), (w-1,0), (w-1,h-1) and (0,h-1)

Obtain the homography using pts_src and pts_dst .

Apply the homography to the source image and blend it with the destination image to obtain the image in Figure 6.

Notice in Figure 6. that we have inserted the image shown in Figure 4. into the Times Square image.

Figure 6. Virtual Billboard. One of the billboards on the left-hand side has been replaced with an image of our choice.


It is my favorite computer vision book. One of my minor regrets is that I turned down an offer to do an internship with Dr. Sing Bing Kang and Dr. Szeliski when I was a grad student. I used that summer to start a company instead :).

Hi. Thanks for the information. Is there a good order in which to read all the posts on this blog? I’m a kind of newbie trying to learn computer vision (I’ve just understood MLPs). Can I learn how people do the ImageNet competition while I learn computer vision?

Hi Satya, I would like to thank you for this sample, and I appreciate the way you explain homography. But I have one remark to add. The warped image is equal to the transformation of the source image to the destination by the homography matrix h. So to obtain that in C++, we should use the inverse of the matrix h in the warpPerspective() function.

Hi Satya, when I am using the homography and warpPerspective for a video, the output video has a lot of flickering depending on the camera angle. I think this technique works only for a static camera. Any ideas to eliminate the flickering? Really appreciate it.

Hi Satya, I used the above technique on the first image and aligned it with the second one. The aligned image is shown in the third image with black areas. I need to compare the two images and show the difference (bitwise_xor or subtraction). One issue is that part of the image is missing, as it has negative coordinates after rotation. But if I add an offset to bring it into the visible area, it will no longer be aligned. So what’s the best way to compare and show the difference? Any pointers?

I am a beginner in computer vision. I am working on a dataset in which, due to the camera’s inclined position, there is a perspective effect. I calculate optical flow; however, due to the perspective distortion, the optical flow is not correct. I attach a figure here. Note that the optical flow for people near the camera is correct, whereas for people far from the camera it is not calculated. I assume this is due to the perspective distortion. Can I apply the homography method explained in this post? And do I need to apply it before calculating the optical flow?

I’ve found the https://dl.dropboxusercontent.com/u/710615/3.jpg of the “source image”. I think that I need to remove a column of white pixels from the upper left corner of the mesh (I don’t know yet how to do it easily). Besides, the mesh image and the original image of the mesh are equal. However, the image of the layer has some similar holes, but they are not equal, which is acceptable.

Sir, can there be a homography between a world plane and the corresponding image plane? If yes, would the coordinates in world dimensions be like (0, 0), (0, w), (w, h), (h, 0), where w and h are the width and height of the rectangular plane in real dimensions (mm/cm)? Wouldn’t there be an inconsistency, since image coordinates are in pixels?

Hello Satya, Is there any way to reduce the opacity of the im_dst image in the im_out image, so that I can see both im_src and im_dst images in im_out image? I tried adding an alpha channel to im_dst image just before using cv2.warpPerspective(), but all I got was a darkened im_dst image in im_out image! How to see both the images at the same time?

When a homography transforms pixel locations, it transforms them into homogeneous coordinates, but they may be scaled by some scaling factor; you must divide by the scaling factor to get back the correct coordinates in your image. So in your example, it should be [s*x’, s*y’, s] = H * [x, y, 1], and a division by s would give you the points [x’, y’, 1].

I have a small question. Generally, the homography matrix is a camera projection matrix when the 3D scene lies on a 2D plane (for example, on z == 0). In that case, we can use the camera projection equations to find this H matrix. But how did we come up with the equation x’ = H*x? Can you give an explanation?


I am an entrepreneur with a love for Computer Vision and Machine Learning with a dozen years of experience (and a Ph.D.) in the field.

In 2007, right after finishing my Ph.D., I co-founded TAAZ Inc. with my advisor Dr. David Kriegman and Kevin Barnes. The scalability and robustness of our computer vision and machine learning algorithms have been put to rigorous test by more than 100M users who have tried our products.