Abstract
Vanishing point detection has many applications, including camera calibration, autonomous vehicle navigation, 3D reconstruction and object recognition. In this project, we explore two state-of-the-art vanishing point detection algorithms, proposed by Rother [1] and Tardif [2]. We implemented both algorithms from scratch and used them to detect three orthogonal vanishing points in indoor scenes. We compare the two algorithms through theoretical analysis and experiments. Finally, we propose a hybrid method that combines the two algorithms and detects three orthogonal vanishing points efficiently without calibrating the camera. We present the results of our implementation and a comparison of the methods, and we discuss how the three orthogonal vanishing points could be used for camera calibration and for finding the pose of objects in the scene.

1 Introduction

Under perspective projection, lines that are parallel in the 3D world converge to a single point in the image plane, called the vanishing point. In Figure 1, the parallel railway tracks meet at a single point in the image, which is the vanishing point of the tracks. Vanishing points encode information about both the camera and the scene. As a result, vanishing point detection is an important problem in computer vision, with applications in camera calibration, autonomous vehicle navigation, 3D reconstruction and object recognition.
Vanishing points arise from the transformation that projects 3D points onto the 2D image plane. This projective transformation does not preserve parallelism, so the images of parallel lines can meet. Assuming ideal perspective projection, the vanishing point is given by v = Kd, where v is the vanishing point in homogeneous image coordinates, K is the camera intrinsic matrix and d is the direction of the lines in the real world [3].
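As a small illustration of v = Kd, the sketch below multiplies a direction by an intrinsic matrix and converts the homogeneous result to pixel coordinates. The matrix values (focal length 800 px, principal point (320, 240)) and the function name `project_direction` are hypothetical, and d is assumed to be expressed in the camera frame.

```python
def project_direction(K, d):
    """Return the vanishing point (x, y) of direction d, or None if it lies at infinity."""
    # Homogeneous vanishing point v = K d (3x3 matrix times 3-vector).
    v = [sum(K[r][c] * d[c] for c in range(3)) for r in range(3)]
    if abs(v[2]) < 1e-12:
        # Direction parallel to the image plane: no finite vanishing point.
        return None
    return (v[0] / v[2], v[1] / v[2])


# Hypothetical intrinsics, chosen only for illustration.
K = [[800.0, 0.0, 320.0],
     [0.0, 800.0, 240.0],
     [0.0, 0.0, 1.0]]

print(project_direction(K, (0.0, 0.0, 1.0)))  # (320.0, 240.0): along the optical axis
print(project_direction(K, (1.0, 0.0, 1.0)))  # (1120.0, 240.0)
print(project_direction(K, (1.0, 0.0, 0.0)))  # None: vanishing point at infinity
```

Lines along the optical axis vanish at the principal point, while lines parallel to the image plane have no finite vanishing point.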
There are numerous algorithms for vanishing point detection, each targeted at a specific application. For instance, the method proposed by Rother [1] is targeted at building reconstruction and
uses a computationally intensive approach. Other algorithms like the one proposed by Tardif [2]
are computationally less intensive. Thus the choice of algorithm depends largely on the application
in mind. Some of the factors considered while choosing an algorithm include robustness, accuracy,
computational efficiency and optimization technique used.
In this project we demonstrate the detection of vanishing points in uncalibrated images of indoor scenes using two state-of-the-art algorithms. We consider uncalibrated cameras because in many situations calibration information is not available. We consider only indoor scenes because they are generally dominated by parallel lines (e.g. ceilings, walls and tables), which provide much information for vanishing point detection. We have implemented the two algorithms from scratch and compared their results. We have also combined the best features of the two algorithms into a hybrid method. Finally, we have explored the possibility of using the vanishing points detected by this hybrid method to estimate the pose of objects in an indoor scene.

Figure 1: Illustration of a vanishing point. (a) The railway tracks appear to converge to a point. (b) The convergence point is called the vanishing point (red dot).

2 Review of Previous Work

Vanishing point detection in images has been an active area of research in computer vision for quite
a while [4; 5] with one of the oldest papers being published over 25 years ago [6]. A majority of
the vanishing point detection algorithms are divided into two steps viz. line detection and model
estimation. In the line detection step edges in the image that correspond to straight lines are
extracted. Line fitting is performed to detect only lines and discard other edges. An important
problem here is that due to errors in the image projection (like lens distortion) lines in the real
world may not be imaged as straight lines.
In the model estimation step the detected lines are considered as a whole to estimate the vanishing points corresponding to parallel lines. This process is computationally intensive and several
optimization techniques have been proposed for this. One approach is to map the unbounded image plane to a bounded space called the accumulator space and then to compute intersections on
this space. An example of an accumulator space is the Gaussian sphere. The intersections of the
detected lines are considered as candidate vanishing points. Other approaches do not employ an accumulator space but work on the detected lines directly and simultaneously estimate the vanishing
points. Examples of the second approach include the method proposed by Kosecka and Zhang [4], which uses the Expectation Maximization (EM) algorithm, and a non-iterative technique proposed by Tardif [2].
For this project two recent algorithms were implemented [1; 2], each employing a different
technique in the model estimation step. Some parts of the algorithms were modified so as to suit
our requirements. In the following sections the two algorithms are explained with notes where
modifications were made.

3 Technical Details

3.1 Rother’s Algorithm

The first algorithm we implemented was proposed by Rother [1] to detect three orthogonal vanishing points in architectural environments. The algorithm consists of two steps, viz. the accumulation step and the search step. Unlike methods that use a Gaussian sphere as the accumulation space, Rother’s algorithm uses the unbounded image plane. First, line segments are detected in the image using the method proposed by Kosecka and Zhang [7]. Then the intersection points, possibly at infinity, of all pairs of non-collinear line segments are taken as vanishing point candidates. A vote is calculated for each candidate based on its relationship to the line segments. Finally, in the search step, the three orthogonal vanishing points with the maximal total vote are selected from the candidates.
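One common way to generate such candidates, including those at infinity, is to work in homogeneous coordinates, where both the line through two points and the intersection of two lines are cross products. The sketch below illustrates this; the function names are ours and it is not taken from Rother's implementation.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def segment_line(seg):
    """Homogeneous line through the endpoints of seg = ((x1, y1), (x2, y2))."""
    (x1, y1), (x2, y2) = seg
    return cross((x1, y1, 1.0), (x2, y2, 1.0))


def intersect(seg_a, seg_b):
    """Homogeneous intersection (x, y, w) of the two supporting lines.

    w == 0 means the lines are parallel and the candidate lies at infinity
    in the direction (x, y).
    """
    return cross(segment_line(seg_a), segment_line(seg_b))


# The lines y = x and y = 2 - x meet at (1, 1).
x, y, w = intersect(((0, 0), (1, 1)), ((0, 2), (1, 1)))
print(x / w, y / w)

# The lines y = x and y = x + 1 are parallel, so w is zero.
print(intersect(((0, 0), (1, 1)), ((0, 1), (1, 2))))
```

Keeping the candidate as a homogeneous triple avoids a special case for parallel segments until pixel coordinates are actually needed.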
To calculate the vote of a potential vanishing point, we need to define a compatibility measure
between a line segment and a vanishing point. We adopt the exponential voting scheme proposed in
[8], which is more discriminative between good and bad vanishing point candidates than the original
voting scheme used by Rother. For each detected line segment s in the image, we define its vote for
a vanishing point p as
$$v(s, p) = |s| \times \exp\left(-\frac{\gamma}{2\sigma^{2}}\right), \tag{1}$$
where |s| denotes the length of line segment s, γ is the angle between s and the line connecting p to the midpoint of s, and σ is a robustness threshold. We say that line segment s votes for vanishing point p if and only if γ < t_γ, where t_γ is a predefined constant. The vote of vanishing point p is given by
$$\mathrm{vote}(p) = \sum_{s \text{ votes for } p} v(s, p). \tag{2}$$
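The voting scheme of Equations (1) and (2) can be sketched as follows for a finite candidate p. The default values of `sigma` and `t_gamma` are placeholders chosen for illustration, not the values used in our experiments, and γ is measured in radians.

```python
import math


def segment_vote(seg, p, sigma=0.1, t_gamma=math.radians(5.0)):
    """Vote v(s, p) of segment seg = ((x1, y1), (x2, y2)) for candidate p = (x, y)."""
    (x1, y1), (x2, y2) = seg
    length = math.hypot(x2 - x1, y2 - y1)
    mx, my = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    # gamma: angle between the segment and the line joining its midpoint to p.
    seg_angle = math.atan2(y2 - y1, x2 - x1)
    to_p_angle = math.atan2(p[1] - my, p[0] - mx)
    gamma = abs(seg_angle - to_p_angle) % math.pi
    gamma = min(gamma, math.pi - gamma)  # lines are undirected, so fold into [0, pi/2]
    if gamma >= t_gamma:
        return 0.0  # the segment does not vote for p
    return length * math.exp(-gamma / (2.0 * sigma ** 2))


def vote(segments, p, **params):
    """Total vote of candidate p, summed over all segments that vote for it."""
    return sum(segment_vote(s, p, **params) for s in segments)


segments = [((0.0, 0.0), (2.0, 0.0)), ((0.0, 1.0), (0.0, 4.0))]
print(vote(segments, (10.0, 0.0)))  # only the horizontal segment points at p
```

A segment that points exactly at p contributes its full length, and the contribution decays exponentially as the angle grows.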

Finally, the triplet of orthogonal vanishing point candidates with the maximal sum of votes is selected. Rother proposes three criteria to check the orthogonality of three vanishing point candidates, viz. the orthogonal criterion, the camera criterion and the vanishing line criterion. A triplet is considered in the search step if and only if it satisfies all three criteria.

3.2 Tardif’s Algorithm

The second algorithm we implemented was proposed by Tardif [2] for estimating vanishing points in man-made environments. Unlike most algorithms, Tardif’s algorithm uses a non-iterative approach, which increases its computational efficiency. It is based on a multiple model estimation technique called J-Linkage [9]. As before, the method consists of two steps, viz. the accumulation step and the search step. As in Rother’s algorithm, the accumulation takes place in the image plane. For line segment detection Tardif suggests a technique based on the Canny edge detector, but in our experiments we found that Kosecka’s method [7] worked much better. The line segments are then clustered using the J-Linkage algorithm, which is explained in the following section. In the search step, the vanishing point corresponding to each detected cluster is found using a least squares approach. The three clusters containing the most line segments are taken to define the three dominant vanishing points.
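To make the least squares step concrete, the sketch below finds the point that minimizes the sum of squared perpendicular distances to the supporting lines of a cluster. This is one simple formulation, assuming a finite vanishing point. It is not necessarily the estimator used in [2], and the function names are ours.

```python
import math


def line_through(seg):
    """Normalized line (a, b, c) with a*x + b*y + c = 0 and a^2 + b^2 = 1."""
    (x1, y1), (x2, y2) = seg
    a, b = y1 - y2, x2 - x1
    c = x1 * y2 - x2 * y1
    n = math.hypot(a, b)
    return (a / n, b / n, c / n)


def least_squares_vp(segments):
    """Point minimizing the sum of squared distances to the segments' lines.

    Returns None when the lines are (nearly) parallel, i.e. no finite solution.
    """
    saa = sab = sbb = sac = sbc = 0.0
    for seg in segments:
        a, b, c = line_through(seg)
        saa += a * a
        sab += a * b
        sbb += b * b
        sac += a * c
        sbc += b * c
    # Solve the 2x2 normal equations [[saa, sab], [sab, sbb]] [x, y]^T = [-sac, -sbc]^T.
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        return None
    x = (-sac * sbb + sbc * sab) / det
    y = (-sbc * saa + sac * sab) / det
    return (x, y)


# Three segments whose supporting lines all pass through (5, 3).
cluster = [((0.0, 3.0), (1.0, 3.0)), ((5.0, 0.0), (5.0, 1.0)), ((0.0, -2.0), (1.0, -1.0))]
print(least_squares_vp(cluster))
```

A formulation in homogeneous coordinates would also handle vanishing points at infinity, at the cost of an eigenvalue computation.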

3.2.1 J-Linkage

Tardif’s algorithm uses J-Linkage for clustering line segments that correspond to the same vanishing
point. J-Linkage is very similar in principle to RANSAC. The difference here is that RANSAC is
used when we want to fit data to one model whereas J-Linkage can fit data to multiple models. In
the present case the models are the vanishing points.
The first step in the agglomerative clustering is to choose M minimal sets of two line segments each. Each minimal set defines one vanishing point hypothesis, giving M models in total. Then a preference matrix is created with one row per line segment and M columns. Each entry is a boolean that records whether the line segment of that row votes for the model of that column. Each row of the preference matrix is initially treated as a cluster. Next, a distance metric is used to find the distances between all pairs of clusters. The distance metric used is the Jaccard distance, d_j(A, B), which is given as
$$d_j(A, B) = \frac{|A \cup B| - |A \cap B|}{|A \cup B|}, \tag{3}$$

where A and B are the sets of models preferred by two clusters. The two clusters with the minimum distance are merged into a single cluster. This process is repeated until the distance between every pair of clusters is 1. At the end of the clustering there are typically 3-7 clusters, of which the three largest are chosen to find the dominant vanishing points.
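The clustering loop can be sketched as below. The sketch assumes the preference sets have already been computed, one per line segment, and follows the usual J-Linkage formulation [9] in which a merged cluster keeps the intersection of its members' preference sets. The function names are ours.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two sets, as in Equation (3)."""
    union = a | b
    if not union:
        return 1.0  # two empty preference sets share no model
    return (len(union) - len(a & b)) / len(union)


def j_linkage(preference_sets):
    """Cluster items by their preference sets.

    preference_sets[i] is the set of model indices that item i votes for.
    Returns a list of clusters, each a list of item indices.
    """
    clusters = [([i], set(ps)) for i, ps in enumerate(preference_sets)]
    while True:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = jaccard_distance(clusters[i][1], clusters[j][1])
                if d < 1.0 and (best is None or d < best[0]):
                    best = (d, i, j)
        if best is None:
            # Every remaining pair is at distance 1: clustering is finished.
            return [members for members, _ in clusters]
        _, i, j = best
        merged = (clusters[i][0] + clusters[j][0], clusters[i][1] & clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]


# Six items and four models; the last item votes for no model.
prefs = [{0}, {0, 1}, {0}, {2}, {2, 3}, set()]
print(sorted(sorted(c) for c in j_linkage(prefs)))  # [[0, 1, 2], [3, 4], [5]]
```

This version searches all pairs on every iteration, which is cubic overall and adequate only for small inputs such as the few hundred line segments considered here.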

3.3 The Hybrid Method

Although Tardif’s algorithm is much faster than Rother’s, it has the drawback that it detects only dominant vanishing points, which are not necessarily orthogonal. Therefore we have devised a hybrid method that combines the two algorithms.
In our approach, line segments are detected as explained earlier. Subsequently, Tardif’s clustering technique is used to find the clusters corresponding to the dominant vanishing points. Finally, the three criteria given by Rother are applied to the dominant vanishing points to select the three that are most nearly orthogonal. This hybrid method is thus able to detect the three orthogonal vanishing points without the need for camera calibration.
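The selection step can be illustrated with a deliberately simplified stand-in for Rother's three criteria. For a camera with zero skew and square pixels, three finite, mutually orthogonal vanishing points always form an acute triangle, so the sketch keeps only acute triplets and returns the one with the largest total support. The function names, the support values and the use of this single test are assumptions made for illustration.

```python
from itertools import combinations


def is_acute(a, b, c):
    """True if the triangle with vertices a, b, c has all angles below 90 degrees."""
    def dot_at(p, q, r):
        # Dot product of the two edges leaving vertex p.
        return (q[0] - p[0]) * (r[0] - p[0]) + (q[1] - p[1]) * (r[1] - p[1])
    return dot_at(a, b, c) > 0 and dot_at(b, a, c) > 0 and dot_at(c, a, b) > 0


def pick_triplet(candidates):
    """Choose the acute triplet with the largest total support.

    candidates is a list of ((x, y), support) pairs, where support is, for
    example, the number of line segments in the cluster. Returns a list of
    three points, or None if no triplet passes the test.
    """
    best, best_support = None, -1
    for trio in combinations(candidates, 3):
        points = [vp for vp, _ in trio]
        if not is_acute(*points):
            continue
        support = sum(s for _, s in trio)
        if support > best_support:
            best, best_support = points, support
    return best


# Four hypothetical dominant vanishing points with their cluster sizes.
dominant = [((0.0, 10.0), 10), ((-10.0, -5.0), 8), ((10.0, -5.0), 7), ((0.0, -5.0), 9)]
print(pick_triplet(dominant))  # [(0.0, 10.0), (-10.0, -5.0), (10.0, -5.0)]
```

The fourth point is rejected even though its cluster is large, because every triangle containing it has a right angle or is degenerate. Vanishing points at infinity would need separate handling.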

3.4 Using Vanishing Points

The three orthogonal vanishing points obtained from the above methods can be used for numerous purposes, including estimating the camera intrinsic matrix K. Another application is to find the pose of objects in the scene. Figure 2 shows the line membership corresponding to the three orthogonal vanishing points for an image containing a common object like a chair. Once K and the rotation matrix R between the camera and the world reference frames are found, we can compute the pose of the chair. This has not been attempted for this project and is a possible future direction.
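As an indication of how K could be estimated, the sketch below uses the standard result that, for zero skew and square pixels, the principal point is the orthocenter of the triangle formed by three finite orthogonal vanishing points, and the focal length f satisfies (v1 - p) · (v2 - p) = -f². This was not part of our implementation. The function names and the synthetic test values are assumptions.

```python
import math


def orthocenter(a, b, c):
    """Orthocenter of triangle abc, found by intersecting two altitudes."""
    # Altitude through a is perpendicular to (c - b); altitude through b to (c - a).
    a1, b1 = c[0] - b[0], c[1] - b[1]
    r1 = a1 * a[0] + b1 * a[1]
    a2, b2 = c[0] - a[0], c[1] - a[1]
    r2 = a2 * b[0] + b2 * b[1]
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        raise ValueError("degenerate triangle")
    return ((r1 * b2 - r2 * b1) / det, (a1 * r2 - a2 * r1) / det)


def intrinsics_from_vps(v1, v2, v3):
    """Estimate K from three finite orthogonal vanishing points.

    Assumes zero skew and square pixels.
    """
    px, py = orthocenter(v1, v2, v3)
    f_squared = -((v1[0] - px) * (v2[0] - px) + (v1[1] - py) * (v2[1] - py))
    if f_squared <= 0:
        raise ValueError("vanishing points are not consistent with orthogonality")
    f = math.sqrt(f_squared)
    return [[f, 0.0, px], [0.0, f, py], [0.0, 0.0, 1.0]]


# Synthetic vanishing points generated from f = 800, principal point (320, 240)
# and the orthogonal directions (2, 1, 2), (-2, 2, 1), (1, 2, -2).
K_est = intrinsics_from_vps((1120.0, 640.0), (-1280.0, 1840.0), (-80.0, -560.0))
print(K_est)
```

With K known, each column of the rotation matrix R follows, up to sign, from normalizing the inverse of K applied to the corresponding vanishing point.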

4 Experimental Results

4.1 Vanishing Points

This section shows the results of our implementations of Rother’s and Tardif’s algorithms. Both algorithms were used to detect three orthogonal vanishing points in indoor images. Various images of indoor scenes obtained from the web were used to test our implementation. Figure 3 displays the vanishing point detection results for two indoor images using Rother’s algorithm. The vertical, horizontal left and horizontal right vanishing points are displayed in distinct colors. In each image, the orthocenter of the triangle defined by the three vanishing lines is the principal point. Figure 4 shows the same results for Tardif’s method. If the lines along a particular direction are almost parallel in a given image, the corresponding vanishing point lies very far away in the image plane. In such cases visualization using the triangle scheme is not possible.
For the given images, Rother’s algorithm produces a more accurate principal point than Tardif’s algorithm. The numerical coordinates of the vanishing points are given in Table 1 for one of the images.

Figure 2: Line membership for a common object like a chair

        Rother’s Algorithm      Tardif’s Algorithm
VP 1    459.14    2687.44       469.08    2740.82
VP 2    -264.14   -33.06        -254.13   16.17
VP 3    1914.77   -19.57        2033.93   -51.44

Table 1: Numerical coordinates of the three orthogonal vanishing points for the box image

4.2 Line Membership

The line membership corresponding to the three orthogonal vanishing points is shown in Figure 5 for one test image. Figures 6 and 7 show the line memberships for several other images tested using our implementation.

4.3 Execution Time

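Table 2 gives the execution time in seconds for images of different sizes. Tardif’s algorithm is consistently faster than Rother’s: about 9 times faster on the 640 × 480 image (0.06 s versus 0.56 s) and about 87 times faster on the 1000 × 747 image (0.26 s versus 22.7 s). No time is reported for Rother’s algorithm on the largest image.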


Figure 3: Three orthogonal vanishing points detected using Rother’s algorithm. The orthocenter of the triangle formed by the three vanishing points is the principal point of the image

Figure 4: Three orthogonal vanishing points detected using Tardif’s algorithm. The orthocenter of the triangle formed by the three vanishing points is the principal point of the image

Image Size     No. of Line Segments   Rother’s (s)   Tardif’s (s)
640 × 480      66                     0.56           0.06
1000 × 747     153                    22.7           0.26
2592 × 1932    261                    –              1.01

Table 2: Execution time in seconds for different image sizes and line segment numbers

Figure 5: Line membership for the three orthogonal vanishing points. (a) Rother’s Algorithm. (b) Tardif’s Algorithm.

4.4 Implementation Details

We implemented and tested the vanishing point detection algorithms in C/C++. The OpenCV library was used for common computer vision algorithms, and MATLAB was used for visualization and testing. All the code, along with additional information, is available online (see footnote 1).
The choice of these tools, with the exception of MATLAB, was largely motivated by their availability under free and open source licenses. We also plan to submit our implementation of the algorithms for inclusion in the OpenCV library.

5 Conclusion

In this project, we have implemented Rother’s and Tardif’s vanishing point detection algorithms
and have applied them to detecting three orthogonal vanishing points in indoor scenes. We have
compared the two algorithms through both theoretical analysis and extensive experiments. Rother’s
method can find three orthogonal vanishing points without calibration of the camera, but it is
computationally expensive. Tardif’s method uses the J-Linkage algorithm for vanishing point detection, which is efficient, but it requires calibration of the camera to find the three orthogonal vanishing points. We have therefore proposed a hybrid method that combines Rother’s and Tardif’s algorithms and can detect three orthogonal vanishing points efficiently without calibration of the camera. As an
extension to this project, we have explored the possibility of using the three orthogonal vanishing
points to calibrate the camera, estimate relative orientation of the camera with respect to the scene
and ultimately to find the pose of objects in the scene.
1. http://www.umich.edu/~srinaths/courses/442/project/


Figure 6: Line membership for the three orthogonal vanishing points using Rother’s Algorithm

Figure 7: Line membership for the three orthogonal vanishing points using Tardif’s Algorithm