Recording images using a camera is equivalent to mapping
object point O in the object space to
image point I' in the film
plane (Fig. 1a). For
digitization,
this recorded image will be projected again to image I
in the projection plane (Fig. 1b).

Fig. 1

For simplicity, however, it is possible to directly relate the projected image and
the object (Fig. 2). Object O is mapped directly to
the projected image I.
The projection plane is called the image
plane. Point N is the new node, or projection center.

Fig. 2

Two reference frames are defined in Fig. 2: the
object-space reference frame (the XYZ-system) and the image-plane reference frame (the UV-system). The
optical system of the camera/projector maps point O
in the object space to image I
in the image plane. [x, y, z]
are the object-space coordinates of point O, while
[u, v] are
the image-plane coordinates of the image point I. Points I, N,
and O are thus collinear. This is the so-called collinearity condition,
the basis of the DLT method.

Now, assume that the position of the projection center (N) in the object-space reference frame is [xo, yo, zo]
(Fig. 3).
Vector A drawn from N
to O then becomes [x
- xo, y - yo, z - zo].

Fig. 3

Add axis W to the image plane reference frame as the third axis to
make the image-plane reference frame 3-dimensional (Fig. 4).
The W-coordinates of
the points on the image plane are always 0, and the 3-dimensional position of point I becomes
[u, v, 0].

Fig. 4

A new point
P, the principal point,
is introduced in Fig. 4. The line drawn from the
projection center N to the image plane, parallel to axis W
and perpendicular to the image plane, is called the principal axis,
and the principal point is the intersection of the principal
axis with the image plane. The principal
distance d is the distance between points P
and N. Assuming the image-plane coordinates of the principal point to
be [uo, vo, 0], the position of point N
in the image-plane reference frame becomes [uo, vo, d]. Vector B drawn from point N to I
then becomes [u - uo, v - vo, -d].

Since points O, I, and N are
collinear,
vectors A (Fig. 3)
and B (Fig. 4)
form a single straight line. The collinearity condition is simply equivalent to
the vector expression

\mathbf{B} = c\,\mathbf{A},    [1]

where c = a scaling factor. Note here that vectors A
and B were originally described in the object-space reference frame and the
image-plane reference frame, respectively. In order to directly relate the
coordinates, it is necessary to describe them in a common reference frame. One
good way to do this is to transform vector A
to the image-plane reference frame:

\mathbf{A}^{(I)} = T_{I/O}\,\mathbf{A}^{(O)},    [2]

where A(I) = vector A
described in the image-plane reference frame, A(O) = vector A
described in the object-space reference frame, and TI/O
= the transformation matrix from the
object-space reference frame to the image-plane reference frame. Apply [2]
to [1]:

\begin{bmatrix} u - u_0 \\ v - v_0 \\ -d \end{bmatrix} = c\,T_{I/O} \begin{bmatrix} x - x_0 \\ y - y_0 \\ z - z_0 \end{bmatrix},    [3]

or

\begin{bmatrix} u - u_0 \\ v - v_0 \\ -d \end{bmatrix} = c \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{bmatrix} \begin{bmatrix} x - x_0 \\ y - y_0 \\ z - z_0 \end{bmatrix}.    [4]

From [4], obtain

c = \frac{-d}{r_{31}(x - x_0) + r_{32}(y - y_0) + r_{33}(z - z_0)}.    [5]

Substitute [5] for c
in [4]:

u - u_0 = -d\,\frac{r_{11}(x - x_0) + r_{12}(y - y_0) + r_{13}(z - z_0)}{r_{31}(x - x_0) + r_{32}(y - y_0) + r_{33}(z - z_0)}, \quad v - v_0 = -d\,\frac{r_{21}(x - x_0) + r_{22}(y - y_0) + r_{23}(z - z_0)}{r_{31}(x - x_0) + r_{32}(y - y_0) + r_{33}(z - z_0)}.    [6]

Note that u, v, uo &
vo in [6] are the image-plane coordinates in a
real-life length unit, such as cm. In reality, however, the
digitization system may use different length units, such as pixels,
and [6] must accommodate this:

u - u_0 = -\frac{d}{\lambda_u}\,\frac{r_{11}(x - x_0) + r_{12}(y - y_0) + r_{13}(z - z_0)}{r_{31}(x - x_0) + r_{32}(y - y_0) + r_{33}(z - z_0)}, \quad v - v_0 = -\frac{d}{\lambda_v}\,\frac{r_{21}(x - x_0) + r_{22}(y - y_0) + r_{23}(z - z_0)}{r_{31}(x - x_0) + r_{32}(y - y_0) + r_{33}(z - z_0)},    [7]

where \lambda_u and \lambda_v = the unit
conversion factors for the U and V axes, respectively. As a
result, u, v, uo & vo
in [7] can be in any unit. Also note that the two
unit conversion factors in [7] can differ
from each other.

Now, rearrange [7] for x,
y, and z:

u = \frac{L_1 x + L_2 y + L_3 z + L_4}{L_9 x + L_{10} y + L_{11} z + 1}, \quad v = \frac{L_5 x + L_6 y + L_7 z + L_8}{L_9 x + L_{10} y + L_{11} z + 1},    [8]

where

\begin{aligned}
L_1 &= \frac{u_0 r_{31} - d_u r_{11}}{D}, & L_2 &= \frac{u_0 r_{32} - d_u r_{12}}{D}, & L_3 &= \frac{u_0 r_{33} - d_u r_{13}}{D}, \\
L_4 &= u_0 + \frac{d_u (r_{11} x_0 + r_{12} y_0 + r_{13} z_0)}{D}, \\
L_5 &= \frac{v_0 r_{31} - d_v r_{21}}{D}, & L_6 &= \frac{v_0 r_{32} - d_v r_{22}}{D}, & L_7 &= \frac{v_0 r_{33} - d_v r_{23}}{D}, \\
L_8 &= v_0 + \frac{d_v (r_{21} x_0 + r_{22} y_0 + r_{23} z_0)}{D}, \\
L_9 &= \frac{r_{31}}{D}, & L_{10} &= \frac{r_{32}}{D}, & L_{11} &= \frac{r_{33}}{D},
\end{aligned}    [9]

with d_u = d / \lambda_u, d_v = d / \lambda_v, and D = -(x_0 r_{31} + y_0 r_{32} + z_0 r_{33}).

Coefficients L1 to L11 in
[8] are the DLT parameters
that reflect the
relationships between the object-space reference frame and the image-plane reference frame.
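Equation [8] is straightforward to evaluate once the eleven parameters are known. The following sketch is not from the original source; the function name and the convention of storing the parameters as a length-11 array in the order L1-L11 are my own assumptions:

```python
import numpy as np

def dlt_project(L, xyz):
    """Map an object-space point [x, y, z] to image-plane coordinates
    (u, v) using the 11 standard DLT parameters (equation [8])."""
    x, y, z = xyz
    # common denominator: L9*x + L10*y + L11*z + 1
    denom = L[8] * x + L[9] * y + L[10] * z + 1.0
    u = (L[0] * x + L[1] * y + L[2] * z + L[3]) / denom
    v = (L[4] * x + L[5] * y + L[6] * z + L[7]) / denom
    return u, v
```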

[8] is the standard 3-D DLT equation,
but one may include in [8] the
optical errors from the lens:

u + \Delta u = \frac{L_1 x + L_2 y + L_3 z + L_4}{L_9 x + L_{10} y + L_{11} z + 1}, \quad v + \Delta v = \frac{L_5 x + L_6 y + L_7 z + L_8}{L_9 x + L_{10} y + L_{11} z + 1},    [10]

where \Delta u and \Delta v = the optical
errors. The optical errors can be expressed as

\begin{aligned}
\Delta u &= \xi\,(L_{12} r^2 + L_{13} r^4 + L_{14} r^6) + L_{15}\,(r^2 + 2\xi^2) + L_{16}\,\xi\eta, \\
\Delta v &= \eta\,(L_{12} r^2 + L_{13} r^4 + L_{14} r^6) + L_{15}\,\xi\eta + L_{16}\,(r^2 + 2\eta^2),
\end{aligned}    [11]

where,

\xi = u - u_0, \quad \eta = v - v_0, \quad r^2 = \xi^2 + \eta^2.    [12]

Among the five additional parameters shown in [11], L12 - L14
are related to the optical distortion while L15 & L16
are for the de-centering distortion (Walton, 1981):

Parameters    Remarks
L1 - L11      standard DLT parameters
L12 - L14     3rd-, 5th-, and 7th-order optical distortion terms
L15 & L16     de-centering distortion terms
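For a given set of the additional parameters, [11] and [12] can be evaluated directly. A minimal sketch, assuming the five parameters are supplied as a tuple (L12, ..., L16); the function name is hypothetical:

```python
def optical_error(u, v, u0, v0, L12_16):
    """Optical-error terms of equations [11]-[12]."""
    L12, L13, L14, L15, L16 = L12_16
    xi, eta = u - u0, v - v0                 # [12]
    r2 = xi * xi + eta * eta                 # r^2 = xi^2 + eta^2
    radial = L12 * r2 + L13 * r2**2 + L14 * r2**3
    du = xi * radial + L15 * (r2 + 2 * xi * xi) + L16 * xi * eta
    dv = eta * radial + L15 * xi * eta + L16 * (r2 + 2 * eta * eta)
    return du, dv
```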

There are two different ways to use [10] in the 3-D
DLT method: camera calibration and reconstruction (raw coordinate computation).

Camera Calibration

Rearrange [10] to obtain

\begin{aligned}
\frac{x}{R} L_1 + \frac{y}{R} L_2 + \frac{z}{R} L_3 + \frac{1}{R} L_4 - \frac{ux}{R} L_9 - \frac{uy}{R} L_{10} - \frac{uz}{R} L_{11} &\\
{} - \xi r^2 L_{12} - \xi r^4 L_{13} - \xi r^6 L_{14} - (r^2 + 2\xi^2) L_{15} - \xi\eta L_{16} &= \frac{u}{R}, \\
\frac{x}{R} L_5 + \frac{y}{R} L_6 + \frac{z}{R} L_7 + \frac{1}{R} L_8 - \frac{vx}{R} L_9 - \frac{vy}{R} L_{10} - \frac{vz}{R} L_{11} &\\
{} - \eta r^2 L_{12} - \eta r^4 L_{13} - \eta r^6 L_{14} - \xi\eta L_{15} - (r^2 + 2\eta^2) L_{16} &= \frac{v}{R},
\end{aligned}    [13]

where

R = L_9 x + L_{10} y + L_{11} z + 1.    [14]

[13] is equivalent to

\begin{bmatrix}
\frac{x}{R} & \frac{y}{R} & \frac{z}{R} & \frac{1}{R} & 0 & 0 & 0 & 0 & -\frac{ux}{R} & -\frac{uy}{R} & -\frac{uz}{R} & -\xi r^2 & -\xi r^4 & -\xi r^6 & -(r^2 + 2\xi^2) & -\xi\eta \\
0 & 0 & 0 & 0 & \frac{x}{R} & \frac{y}{R} & \frac{z}{R} & \frac{1}{R} & -\frac{vx}{R} & -\frac{vy}{R} & -\frac{vz}{R} & -\eta r^2 & -\eta r^4 & -\eta r^6 & -\xi\eta & -(r^2 + 2\eta^2)
\end{bmatrix}
\begin{bmatrix} L_1 \\ L_2 \\ \vdots \\ L_{16} \end{bmatrix} =
\begin{bmatrix} \frac{u}{R} \\ \frac{v}{R} \end{bmatrix}.    [15]

Expand [15] for n control points:

\begin{bmatrix} C_1 \\ C_2 \\ \vdots \\ C_n \end{bmatrix}
\begin{bmatrix} L_1 \\ L_2 \\ \vdots \\ L_{16} \end{bmatrix} =
\begin{bmatrix} u_1/R_1 \\ v_1/R_1 \\ \vdots \\ u_n/R_n \\ v_n/R_n \end{bmatrix},    [16]

where C_i = the 2 x 16 coefficient matrix of [15] evaluated at control point i (with x_i, y_i, z_i, u_i, v_i, R_i, \xi_i, \eta_i, and r_i).

In [16], it was assumed that
the
object-space coordinates, [xi, yi, zi],
were all known. A group of control points whose x, y & z coordinates are
already known must be employed for this. The control points must not be
co-planar. In other words, the
control points must form a volume, the control volume.
The control points are typically fixed to a calibration frame or control
object.

If fewer than 16 parameters are to be used, discard
the corresponding columns of the coefficient matrix and rows of the parameter vector in [16]. Feasible
choices are 11, 12, 14, and 16 parameters.

Note also that the coefficient matrix in [16]
requires Ri, which is a function of L9
- L11. This system therefore cannot be solved directly, and an
iterative approach must be used: L9 - L11
obtained from the previous iteration are used in computing Ri
in the current iteration.
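The iterative scheme can be sketched as follows, here for the 11-parameter case without the optical-error terms. The function name, the use of NumPy's least-squares solver, and the fixed iteration count are assumptions for illustration, not prescriptions from the source:

```python
import numpy as np

def dlt_calibrate(xyz, uv, n_iter=5):
    """Estimate the 11 standard DLT parameters from control points
    (equations [13]-[16] without the optical-error terms).
    Rows are weighted by 1/R_i, with R_i taken from the previous pass."""
    xyz = np.asarray(xyz, float)
    uv = np.asarray(uv, float)
    n = len(xyz)
    L = None
    R = np.ones(n)                      # R_i = 1 on the first pass
    for _ in range(n_iter):
        A = np.zeros((2 * n, 11))
        b = np.zeros(2 * n)
        for i, ((x, y, z), (u, v)) in enumerate(zip(xyz, uv)):
            w = 1.0 / R[i]
            A[2*i]   = np.array([x, y, z, 1, 0, 0, 0, 0, -u*x, -u*y, -u*z]) * w
            A[2*i+1] = np.array([0, 0, 0, 0, x, y, z, 1, -v*x, -v*y, -v*z]) * w
            b[2*i], b[2*i+1] = u * w, v * w
        L, *_ = np.linalg.lstsq(A, b, rcond=None)
        R = xyz @ L[8:11] + 1.0         # update R_i from the new L9-L11
    return L
```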

To obtain the DLT parameters and the additional parameters using the
least-squares method, [16] must be over-determined (number of
equations > number of unknowns). Since each control point provides 2
equations, the minimum number of control points required is:

No. of Parameters    Minimum No. of Control Points
11                   6
12                   6
14                   7
16                   8

Reconstruction

Rearrange [10] for x, y
& z:

\begin{aligned}
(L_1 - u' L_9)\,x + (L_2 - u' L_{10})\,y + (L_3 - u' L_{11})\,z &= u' - L_4, \\
(L_5 - v' L_9)\,x + (L_6 - v' L_{10})\,y + (L_7 - v' L_{11})\,z &= v' - L_8,
\end{aligned}    [19]

where

u' = u + \Delta u, \quad v' = v + \Delta v.    [20]

[19] is equivalent to the
matrix expression

\begin{bmatrix} L_1 - u' L_9 & L_2 - u' L_{10} & L_3 - u' L_{11} \\ L_5 - v' L_9 & L_6 - v' L_{10} & L_7 - v' L_{11} \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} u' - L_4 \\ v' - L_8 \end{bmatrix}.    [21]

Expand [21] for m
cameras:

\begin{bmatrix} Q_1 \\ Q_2 \\ \vdots \\ Q_m \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix} =
\begin{bmatrix} q_1 \\ q_2 \\ \vdots \\ q_m \end{bmatrix},    [22]

where

Q_k = \begin{bmatrix} L_1^{(k)} - u_k' L_9^{(k)} & L_2^{(k)} - u_k' L_{10}^{(k)} & L_3^{(k)} - u_k' L_{11}^{(k)} \\ L_5^{(k)} - v_k' L_9^{(k)} & L_6^{(k)} - v_k' L_{10}^{(k)} & L_7^{(k)} - v_k' L_{11}^{(k)} \end{bmatrix},    [23]

and q_k = \begin{bmatrix} u_k' - L_4^{(k)} \\ v_k' - L_8^{(k)} \end{bmatrix}, with superscript (k) denoting camera k (k = 1, ..., m).
Again, the least square method described in [17]
and [18] can be used in computing the 3-D coordinates of the
markers on the subject's body.
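For lens-corrected image coordinates (\Delta u = \Delta v = 0), the reconstruction system [19]-[22] is a small linear least-squares problem. A minimal sketch with hypothetical helper names:

```python
import numpy as np

def dlt_reconstruct(Ls, uvs):
    """Recover [x, y, z] from m >= 2 camera views (equations [19]-[22]).
    Ls: 11-element DLT parameter arrays, one per camera;
    uvs: matching (u, v) pairs, assumed already lens-corrected."""
    A, b = [], []
    for L, (u, v) in zip(Ls, uvs):
        # two rows of [21] per camera
        A.append([L[0] - u*L[8], L[1] - u*L[9], L[2] - u*L[10]])
        A.append([L[4] - v*L[8], L[5] - v*L[9], L[6] - v*L[10]])
        b += [u - L[3], v - L[7]]
    xyz, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return xyz
```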

Camera Position and the Principal
Point

From [9]:

L_1 x_0 + L_2 y_0 + L_3 z_0 = -L_4, \quad L_5 x_0 + L_6 y_0 + L_7 z_0 = -L_8, \quad L_9 x_0 + L_{10} y_0 + L_{11} z_0 = -1,    [24]

or

\begin{bmatrix} x_0 \\ y_0 \\ z_0 \end{bmatrix} = -\begin{bmatrix} L_1 & L_2 & L_3 \\ L_5 & L_6 & L_7 \\ L_9 & L_{10} & L_{11} \end{bmatrix}^{-1} \begin{bmatrix} L_4 \\ L_8 \\ 1 \end{bmatrix}.    [25]
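Equation [25] amounts to a single 3x3 linear solve. A sketch, assuming the 11 parameters are stored in order in a sequence L (the function name is hypothetical):

```python
import numpy as np

def camera_position(L):
    """Projection-center position [x0, y0, z0] from the 11 standard
    DLT parameters (equation [25])."""
    M = np.array([[L[0], L[1], L[2]],
                  [L[4], L[5], L[6]],
                  [L[8], L[9], L[10]]])
    # solve M @ [x0, y0, z0] = -[L4, L8, 1]
    return -np.linalg.solve(M, np.array([L[3], L[7], 1.0]))
```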

Similarly, from [9]:

D^2 = \frac{1}{L_9^2 + L_{10}^2 + L_{11}^2},    [26]

and

u_0 = D^2\,(L_1 L_9 + L_2 L_{10} + L_3 L_{11}), \quad v_0 = D^2\,(L_5 L_9 + L_6 L_{10} + L_7 L_{11}).    [27]

Both [26] and [27]
are based on the orthogonality of
the transformation matrix TI/O:

T_{I/O}\,T_{I/O}^{t} = I.    [28]

Solving [9] for the elements r_{ij} yields the transformation matrix in terms of the DLT parameters:

T_{I/O} = D \begin{bmatrix} \frac{u_0 L_9 - L_1}{d_u} & \frac{u_0 L_{10} - L_2}{d_u} & \frac{u_0 L_{11} - L_3}{d_u} \\ \frac{v_0 L_9 - L_5}{d_v} & \frac{v_0 L_{10} - L_6}{d_v} & \frac{v_0 L_{11} - L_7}{d_v} \\ L_9 & L_{10} & L_{11} \end{bmatrix}.    [29]

To obtain the transformation matrix in [29], d_u and d_v
must first be computed. From [28] and [29]:

d_u = \pm\sqrt{D^2\,(L_1^2 + L_2^2 + L_3^2) - u_0^2}, \quad d_v = \pm\sqrt{D^2\,(L_5^2 + L_6^2 + L_7^2) - v_0^2}.    [30]

It is safe to assume d_u > 0 and d_v > 0
in [30]. However, D can be either
positive or negative, as shown in [26]. Use the
positive value of D in [29] first and compute the
determinant of the resulting transformation matrix. If the determinant is
positive (right-handed system), D is indeed positive and the matrix is
correct as computed. If the determinant is negative (left-handed system), D
must be negative; multiply the matrix obtained previously by -1. Three Eulerian angles may be computed from the nine
elements of the transformation matrix. See the Eulerian Angles
page for details.
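The steps above ([26], [27], [30], [29], and the determinant-based sign check for D) can be collected into one routine. A sketch under the same assumptions as before (hypothetical function name; the 11 parameters in a length-11 array):

```python
import numpy as np

def camera_internals(L):
    """Principal point, principal distances, and transformation matrix
    from the 11 standard DLT parameters (equations [26]-[30])."""
    L = np.asarray(L, float)
    D2 = 1.0 / (L[8]**2 + L[9]**2 + L[10]**2)         # [26]
    u0 = D2 * (L[0]*L[8] + L[1]*L[9] + L[2]*L[10])    # [27]
    v0 = D2 * (L[4]*L[8] + L[5]*L[9] + L[6]*L[10])
    du = np.sqrt(D2 * (L[0]**2 + L[1]**2 + L[2]**2) - u0**2)  # [30], du > 0
    dv = np.sqrt(D2 * (L[4]**2 + L[5]**2 + L[6]**2) - v0**2)  # [30], dv > 0
    D = np.sqrt(D2)                                   # try D > 0 first
    T = np.array([(u0 * L[8:11] - L[0:3]) * D / du,
                  (v0 * L[8:11] - L[4:7]) * D / dv,
                  L[8:11] * D])                       # [29]
    if np.linalg.det(T) < 0:                          # left-handed: D < 0
        T = -T
    return (u0, v0), (du, dv), T
```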

Note here that the DLT parameters computed using the
least-squares method, [18], do not
automatically guarantee an orthogonal transformation matrix, due to the experimental errors. This is
an intrinsic problem of the DLT method. The Modified
DLT method proposed by Hatze (1988) addresses this
problem. See the Modified DLT page for details.

In the case of 2-D analysis, the z-coordinate is always
0 and the mapping from the object-plane
reference frame into the image-plane reference frame reduces to

u = \frac{L_1 x + L_2 y + L_3}{L_7 x + L_8 y + 1}, \quad v = \frac{L_4 x + L_5 y + L_6}{L_7 x + L_8 y + 1}.    [31]

Apply [31]
to n (n >= 4) control points and m (m >= 1)
cameras:

\begin{bmatrix}
\frac{x_1}{R_1} & \frac{y_1}{R_1} & \frac{1}{R_1} & 0 & 0 & 0 & -\frac{u_1 x_1}{R_1} & -\frac{u_1 y_1}{R_1} \\
0 & 0 & 0 & \frac{x_1}{R_1} & \frac{y_1}{R_1} & \frac{1}{R_1} & -\frac{v_1 x_1}{R_1} & -\frac{v_1 y_1}{R_1} \\
\vdots & & & & & & & \vdots \\
\frac{x_n}{R_n} & \frac{y_n}{R_n} & \frac{1}{R_n} & 0 & 0 & 0 & -\frac{u_n x_n}{R_n} & -\frac{u_n y_n}{R_n} \\
0 & 0 & 0 & \frac{x_n}{R_n} & \frac{y_n}{R_n} & \frac{1}{R_n} & -\frac{v_n x_n}{R_n} & -\frac{v_n y_n}{R_n}
\end{bmatrix}
\begin{bmatrix} L_1 \\ L_2 \\ \vdots \\ L_8 \end{bmatrix} =
\begin{bmatrix} u_1/R_1 \\ v_1/R_1 \\ \vdots \\ u_n/R_n \\ v_n/R_n \end{bmatrix},    [32]

and

\begin{bmatrix}
L_1^{(1)} - u^{(1)} L_7^{(1)} & L_2^{(1)} - u^{(1)} L_8^{(1)} \\
L_4^{(1)} - v^{(1)} L_7^{(1)} & L_5^{(1)} - v^{(1)} L_8^{(1)} \\
\vdots & \vdots \\
L_1^{(m)} - u^{(m)} L_7^{(m)} & L_2^{(m)} - u^{(m)} L_8^{(m)} \\
L_4^{(m)} - v^{(m)} L_7^{(m)} & L_5^{(m)} - v^{(m)} L_8^{(m)}
\end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix} =
\begin{bmatrix} u^{(1)} - L_3^{(1)} \\ v^{(1)} - L_6^{(1)} \\ \vdots \\ u^{(m)} - L_3^{(m)} \\ v^{(m)} - L_6^{(m)} \end{bmatrix},    [33]

where

R_i = L_7 x_i + L_8 y_i + 1.    [34]

The object plane and the image plane do not have to be
parallel. 2-D DLT guarantees accurate plane-to-plane mapping regardless of
the orientation of the planes. The control points must not be collinear and
must form a plane.
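The 2-D DLT calibration and mapping of [31]-[34] reduce to a small linear system per camera. The sketch below omits the iterative 1/R reweighting, which exact (noise-free) data does not require; the function names are hypothetical:

```python
import numpy as np

def dlt2d_calibrate(xy, uv):
    """Estimate the 8 parameters of the 2-D DLT (equation [31])
    from n >= 4 non-collinear control points (one camera)."""
    A, b = [], []
    for (x, y), (u, v) in zip(xy, uv):
        A.append([x, y, 1, 0, 0, 0, -u*x, -u*y])
        A.append([0, 0, 0, x, y, 1, -v*x, -v*y])
        b += [u, v]
    L, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return L

def dlt2d_map(L, x, y):
    """Map an object-plane point to the image plane (equation [31])."""
    R = L[6]*x + L[7]*y + 1.0
    return (L[0]*x + L[1]*y + L[2]) / R, (L[3]*x + L[4]*y + L[5]) / R
```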

The accuracy of camera calibration and reconstruction can
be assessed by computing the calibration error and/or the reconstruction error.
The calibration error of a given camera is defined as

E_c = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left[ (u_i - \hat{u}_i)^2 + (v_i - \hat{v}_i)^2 \right]},    [35]

where [\hat{u}_i, \hat{v}_i] = the image-plane coordinates of control point i recomputed from the DLT parameters.

The DLT and additional parameters obtained through the
calibration can be applied back to the control points for the computation of
their reconstructed coordinates. The reconstruction error is the deviation of
the reconstructed coordinates from the measured:

E_r = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 + (z_i - \hat{z}_i)^2 \right]},    [36]

where [\hat{x}_i, \hat{y}_i, \hat{z}_i]
= the reconstructed coordinates of the control point.
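The exact normalization used for these error measures is not shown here, so the sketch below assumes a root-mean-square (RMS) definition of both the calibration and the reconstruction error; the function names are hypothetical:

```python
import numpy as np

def calibration_error(uv_measured, uv_predicted):
    """RMS deviation of digitized image coordinates from those
    recomputed through the DLT parameters, for one camera."""
    d = np.asarray(uv_measured, float) - np.asarray(uv_predicted, float)
    return np.sqrt(np.mean(np.sum(d**2, axis=1)))

def reconstruction_error(xyz_measured, xyz_reconstructed):
    """RMS deviation of the reconstructed control-point coordinates
    from the measured ones."""
    d = np.asarray(xyz_measured, float) - np.asarray(xyz_reconstructed, float)
    return np.sqrt(np.mean(np.sum(d**2, axis=1)))
```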

A self-extrapolation scheme can be employed to improve the
reliability of
the calibration/reconstruction error (Kwon, 1989). Only half of the control points
are used in the computation of the parameters while all points are used in the
reconstruction. The reconstruction error computed in this way is more reliable
and better reflects the actual object-space deformation error. The minimum number of control points required for
self-extrapolation depends on the number of parameters.

The calibration frame must be large enough to enclose the space of
motion well. If it is too small, marker coordinates outside the control volume must be
extrapolated and, as a result, become inaccurate.

Once the recording for camera calibration is done, do not
alter the camera settings, including the focus.

Include as many control points as possible and spread them uniformly
throughout the control volume. This will increase the redundancy of the system
([16]) and improve the accuracy of the
calibration.

Use as many cameras as possible
to increase the redundancy for the reconstruction ([22]). But
avoid positioning cameras facing each other, or at least avoid using such camera
combinations in the reconstruction. Using different camera combinations in
the reconstruction of the unknown markers over the course of a trial may
cause discontinuities in the coordinate-time curves, mainly because of the
experimental errors involved in the digitizing process.

Set the control object (calibration frame) properly so that its axes align
well with the direction of the motion. If it is difficult to align the horizontal
axes properly in 3-D analysis, keep at least the Z axis vertical. This will
simplify the subsequent axis alignment substantially.

Kwon, Y.-H. (1989).
The effects of different control point conditions on the DLT
calibration accuracy. Unpublished class project report, Pennsylvania State
University.

Marzan,
G.T. & Karara, H.M. (1975). A computer
program for direct linear transformation solution of the collinearity condition, and some
applications of it. Proceedings of the Symposium on Close-Range
Photogrammetric Systems (pp. 420-476). Falls Church, VA: American Society of
Photogrammetry.