
Retinex Theory and Algorithm

The Retinex theory, as originally developed by Land and McCann [1], can be seen as a fundamental theory underlying several state-of-the-art intrinsic image algorithms. This article discusses a mathematical formulation of the Retinex algorithm and presents experimental results.

The problem of estimating intrinsic images, that is, reflectance and shading, from a single image was already addressed in 1971 by Land and McCann [1] and in 1978 by Barrow and Tenenbaum [4]. Researchers in the computer vision and image processing communities agree that solving this problem would benefit many tasks in vision and graphics. After discussing related work and important datasets in the article Intrinsic Images – Introduction and Reading List, this article introduces a mathematical formulation of the Retinex theory as originally developed by Land and McCann [1]. Further, we introduce the partial differential equation formalization given by Morel et al. [28], which can easily be implemented in practice.

Instead of discussing related work in detail, we simply point out some recent literature on intrinsic images [7,9-12,14-19,20] and intrinsic video (or image sequences) [21-27] as well as appropriate datasets [7,8,25,29,30].

Note that the problem of intrinsic images as originally stated by Barrow and Tenenbaum [4] also includes other intrinsic characteristics, for example depth, surface orientation, occlusion and motion boundaries, or optical flow.

Retinex Theory and Algorithm

The Retinex theory was introduced by Land and McCann [1] in 1971 and is based on the assumption of a Mondrian world. This refers to the paintings by the Dutch painter Piet Mondrian; an example of their typical style is depicted in figure 1. Land and McCann argue that human color sensation appears to be independent of the amount of light, that is, the measured intensity, coming from observed surfaces [1]. Therefore, Land and McCann suspect an underlying characteristic guiding human color sensation [1]. Note that this characteristic is not directly related to the intrinsic images described by Barrow and Tenenbaum [4]; however, as stated by Land and McCann, it has to be related to reflectance.

Figure 1: An illustration of the intuition behind the Retinex theory. Left: An example of the typical style of Mondrian's paintings. Right: The path-based reflectance computation as proposed by Land and McCann [1].

The intuition behind the Retinex theory, illustrated on the example of a Mondrian image with artificial, smooth shading, can be described as follows. The reflectance ratio of two selected patches within the Mondrian image can be determined by following a path between them. Along the path, we multiply the measured intensity ratios. At the border of patches, the intensity ratios will correspond to reflectance changes, while within the patches, the intensity ratios will be close to one. Land and McCann [1] propose to threshold these ratios to suppress intensity changes due to shading, which is assumed to be smooth. The final ratio will be equal to the actual reflectance ratio between the two patches.
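The path-based intuition can be tried out on a one-dimensional strip. The following is a minimal sketch, not the original procedure of Land and McCann [1]: the function name `path_reflectance_ratio`, the synthetic two-patch strip, and the threshold value are assumptions made for illustration, and ratios are accumulated in the logarithmic domain so that multiplication becomes summation.

```python
import numpy as np

def path_reflectance_ratio(intensities, tau):
    """Estimate the reflectance ratio between the last and the first pixel of a path.

    Log-ratios of consecutive pixels with magnitude at most tau are attributed
    to smooth shading and discarded (ratio of one); the remaining ratios are
    multiplied along the path.
    """
    log_i = np.log(np.asarray(intensities, dtype=np.float64))
    steps = np.diff(log_i)               # log of consecutive intensity ratios
    steps[np.abs(steps) <= tau] = 0.0    # threshold: suppress shading
    return float(np.exp(steps.sum()))    # product of the surviving ratios

# Hypothetical Mondrian strip: two patches with reflectance 0.2 and 0.8,
# observed under a smooth shading ramp from 1.0 down to 0.5.
reflectance = np.array([0.2] * 50 + [0.8] * 50)
shading = np.linspace(1.0, 0.5, 100)
path = reflectance * shading

raw_ratio = path[-1] / path[0]                     # mixes reflectance and shading
ratio = path_reflectance_ratio(path, tau=0.1)      # shading steps are thresholded away
```

In this example the raw intensity ratio between the two ends of the strip is 2, while the thresholded path product is close to the true reflectance ratio of 4. The small remaining deviation comes from the shading change contained in the single step across the patch border.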

The formalization of the described algorithm, as well as its extension to color images, is due to Horn [3] and Blake [5]; here, however, we follow Morel et al. [28]. The reflectance for each pixel in channel $I_c$ is computed by considering a family $\{x_{k}\}_{k = 1}^{K_j} \subseteq \underline{H} \times \underline{W}$ of $J$ paths. For a given pixel $x_n \in \underline{H} \times \underline{W}$, all of these paths start at $x_n$ and end at an arbitrary pixel within the image. Then the reflectance $R_c(x_n)$ of pixel $x_n$ in channel $c$ is computed as
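One way to write this computation, consistent with the thresholded path products described above and with the path-based formulation discussed by Morel et al. [28], is sketched below. The threshold function $\delta_\tau$, the threshold $\tau$, and the superscript $(j)$ that indexes the paths are notational assumptions of this sketch, and the result is expressed in the logarithmic domain:

```latex
R_c(x_n) = \frac{1}{J} \sum_{j = 1}^{J} \sum_{k = 1}^{K_j - 1}
    \delta_\tau\left(\log I_c\big(x_k^{(j)}\big) - \log I_c\big(x_{k + 1}^{(j)}\big)\right),
\qquad
\delta_\tau(s) =
\begin{cases}
    s & \text{if } |s| > \tau, \\
    0 & \text{otherwise.}
\end{cases}
```

Without thresholding, the inner sum telescopes to $\log I_c(x_1^{(j)}) - \log I_c(x_{K_j}^{(j)})$, that is, the log-intensity of $x_n$ relative to the end point of path $j$. The threshold removes the small contributions attributed to shading.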

The main difficulty of this approach is choosing the paths $\{x_{k}\}_{k = 1}^{K_j}$ such that equation (1) can be computed efficiently and still yields the proper reflectance (that is, a good approximation thereof).

Partial Differential Equation Formalization

Morel et al. [28] propose a partial differential equation formalization of the Retinex theory. We follow [28] and discuss both the basic framework and the practical implementation.

We consider a family of $J$ random paths $\{x_{k}\}_{k = 1}^{K_j} \subseteq \underline{H} \times \underline{W}$, each of which starts at pixel $x_n$ and ends at a randomly chosen pixel. The random paths are defined on the plane $\mathbb{Z}^2$ but are reflected at the image borders, resulting in well-defined paths within the image plane. Then, the lightness of pixel $x_n$ in channel $I_c$, in analogy to equation (1), can be written as
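In the formalization of [28], averaging over such random paths leads to a discrete Poisson equation with Neumann boundary conditions. A sketch of its general form is given below. The symbols are assumptions: $R_c$ is reused from equation (1) for the unknown, $N_4(x_n)$ denotes the four-neighborhood of $x_n$, $\delta_\tau$ sets differences with magnitude at most $\tau$ to zero, and $I_c$ may stand for the intensity or its logarithm depending on the variant.

```latex
-\Delta R_c(x_n) = F_c(x_n),
\qquad
F_c(x_n) = \sum_{x_m \in N_4(x_n)} \delta_\tau\big(I_c(x_n) - I_c(x_m)\big),
\qquad
\Delta R_c(x_n) = \sum_{x_m \in N_4(x_n)} \big(R_c(x_m) - R_c(x_n)\big)
```

For $\tau = 0$ the right-hand side reduces to $-\Delta I_c(x_n)$, so the solution reproduces $I_c$ up to an additive constant. Thresholding removes small gradients from the right-hand side before the image is re-integrated.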

The Poisson equation (4) can be solved efficiently using the discrete Fourier transform [28,31]. We follow [31] and briefly derive the necessary equations. The discrete, two-dimensional Fourier transform of the function $F_c$ is given as

Applying the inverse, discrete, two-dimensional Fourier transform to retrieve $I_c$ solves the Poisson equation (4) in the case of periodic boundary conditions. However, replacing the Fourier transform in equations (5) and (6) with the discrete, two-dimensional cosine transform, see [31], satisfies the Neumann boundary conditions required by equation (4).
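The cosine-transform approach can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation discussed in this article: it uses SciPy's `scipy.fft.dctn` and `scipy.fft.idctn` instead of FFTW3, and it assumes a four-neighborhood, thresholding of raw intensity differences, and replicated borders. The function names `retinex_rhs` and `solve_poisson_neumann` are hypothetical.

```python
import numpy as np
from scipy.fft import dctn, idctn

def retinex_rhs(channel, tau):
    """Right-hand side F: sum of thresholded differences to the four neighbors."""
    img = np.asarray(channel, dtype=np.float64)
    padded = np.pad(img, 1, mode="edge")  # replicated border: zero difference across it
    neighbors = (padded[:-2, 1:-1], padded[2:, 1:-1],
                 padded[1:-1, :-2], padded[1:-1, 2:])
    rhs = np.zeros_like(img)
    for neighbor in neighbors:
        diff = img - neighbor
        rhs += np.where(np.abs(diff) > tau, diff, 0.0)  # drop small differences
    return rhs

def solve_poisson_neumann(rhs):
    """Solve -Laplace(u) = rhs with Neumann boundary conditions via the DCT-II."""
    height, width = rhs.shape
    rhs_hat = dctn(rhs, type=2, norm="ortho")
    k = np.arange(height).reshape(-1, 1)
    l = np.arange(width).reshape(1, -1)
    # Eigenvalues of the negative discrete Laplacian in the cosine basis.
    eigenvalues = 4.0 - 2.0 * np.cos(np.pi * k / height) - 2.0 * np.cos(np.pi * l / width)
    eigenvalues[0, 0] = 1.0          # avoid division by zero for the constant mode
    u_hat = rhs_hat / eigenvalues
    u_hat[0, 0] = 0.0                # the solution is only defined up to a constant
    return idctn(u_hat, type=2, norm="ortho")

def retinex(channel, tau):
    """Apply the thresholded Poisson reconstruction to a single channel."""
    return solve_poisson_neumann(retinex_rhs(channel, tau))

# Example on a random single-channel image with intensities in [0, 255].
image = np.random.default_rng(0).uniform(0.0, 255.0, size=(8, 6))
unchanged = retinex(image, tau=0.0)     # nothing is thresholded away
flattened = retinex(image, tau=1000.0)  # every difference is thresholded away
```

With $\tau = 0$ no difference is suppressed and the sketch returns the input up to its mean. With a threshold larger than every difference the right-hand side vanishes and the output is constant. Since the solution is only defined up to an additive constant, a practical implementation would typically rescale the result afterwards.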

Generalization to Video

The Retinex algorithm as implemented by Morel et al. [28] can easily be generalized to video. The Poisson equation, as formulated for the image $I$ in equation (4), equally holds for the video $V$, and equations (5) and (6) may easily be generalized to three dimensions. Only the right-hand side $F_c(x_n)$, $x_n \in \underline{H} \times \underline{W} \times \underline{T}$, of equation (4) needs to be adapted as follows:
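A natural adaptation, assumed here rather than taken from [28], extends the sum of thresholded differences from the four spatial neighbors to the six spatio-temporal neighbors. The neighborhood symbol $N_6$, the video channel $V_c$, and the threshold function $\delta_\tau$ are notation introduced for this sketch:

```latex
F_c(x_n) = \sum_{x_m \in N_6(x_n)} \delta_\tau\big(V_c(x_n) - V_c(x_m)\big),
\qquad
\delta_\tau(s) =
\begin{cases}
    s & \text{if } |s| > \tau, \\
    0 & \text{otherwise,}
\end{cases}
```

where $N_6(x_n)$ contains the four spatial neighbors of $x_n$ within the same frame together with the pixels at the same position in the previous and the next frame. In this form a single threshold acts on spatial and temporal differences alike; separate thresholds would be an equally plausible choice.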

Implementation

An implementation is provided by Limare et al. [32], available on GitHub, and uses FFTW3 [33] to compute the discrete cosine transform in two or three dimensions. Each image channel $I_c$ is processed independently. An OpenCV [34] wrapper can also be found on GitHub.

The main limitation when generalizing to video is the temporal segmentation required by memory constraints. However, the algorithm can be applied to short, overlapping subsequences which are fused afterwards. Ideally, the overlapping regions could be encoded as constant boundary conditions for the Poisson equation (4).
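How the subsequences are fused is left open above. The following sketch shows one simple possibility, a linear cross-fade over the overlapping frames. The function `process_in_chunks`, its parameters, and the callable `process` (standing in for a three-dimensional Retinex solver) are hypothetical and chosen for illustration only.

```python
import numpy as np

def process_in_chunks(video, process, length, overlap):
    """Apply `process` to overlapping temporal chunks and cross-fade the overlaps.

    video:   array of shape (T, H, W)
    process: callable mapping a chunk of shape (t, H, W) to an array of the same shape
    length:  number of frames per chunk
    overlap: number of frames shared by consecutive chunks
    """
    if not 0 <= overlap < length:
        raise ValueError("overlap must satisfy 0 <= overlap < length")
    video = np.asarray(video, dtype=np.float64)
    frames = video.shape[0]
    result = np.zeros_like(video)
    weight = np.zeros(frames)
    # Strictly positive ramp used to fade chunks in and out.
    ramp = np.linspace(0.0, 1.0, overlap + 2)[1:-1]
    start = 0
    while True:
        stop = min(start + length, frames)
        w = np.ones(stop - start)
        if overlap > 0:
            if start > 0:
                w[:overlap] = ramp            # fade in after the previous chunk
            if stop < frames:
                w[-overlap:] = ramp[::-1]     # fade out into the next chunk
        result[start:stop] += w[:, None, None] * process(video[start:stop])
        weight[start:stop] += w
        if stop == frames:
            break
        start += length - overlap
    return result / weight[:, None, None]     # normalize the accumulated weights

# Example: with the identity as `process`, fusing must reproduce the input.
clip = np.random.default_rng(1).uniform(0.0, 255.0, size=(23, 4, 5))
fused = process_in_chunks(clip, lambda chunk: chunk, length=8, overlap=3)
```

Because each chunk is solved independently and the Poisson solution is only defined up to an additive constant, the chunks may differ by an offset that a cross-fade smooths but does not remove. Encoding the overlap as a boundary condition, as suggested above, would address this more directly.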

Results

Figure 2 shows the influence of the threshold $\tau$ on two images, one taken from the dataset by Bell et al. [7] and one taken from the dataset by Beigpour et al. [29]. Further, figure 3 compares the Retinex algorithm to the approach by Bell et al. [7], available on GitHub. Note that, in theory, shading cannot be recovered without knowing the illumination. However, Bell et al. [7] assume white illumination to approximate shading; their algorithm therefore also returns shading, which is excluded from the comparison.

Figure 2: The influence of the threshold $\tau$ on an example image from the dataset by Bell et al. [7] (top) and the dataset by Beigpour et al. [29] (bottom). From left to right: $\tau = 5$, $\tau = 10$, $\tau = 15$, $\tau = 20$, $\tau = 25$. As can be seen, more and more artifacts occur as the threshold increases. Therefore, we can deduce that global thresholding of gradients can only be seen as a first step towards intrinsic images, as is also illustrated by figure 3.

Figure 3: Comparison of the Retinex algorithm and the algorithm by Bell et al. [7] using several images from the dataset by Bell et al. [7].

Discussion

The Retinex algorithm can be seen as the first approach to intrinsic images. Accordingly, as seen in figure 3, its results do not match those of state-of-the-art approaches such as the one by Bell et al. [7]. However, the assumptions made by the Retinex theory, in particular smooth shading and piecewise constant reflectance, are still used in many recent approaches. State-of-the-art approaches add further assumptions, for example sparsity of reflectance [9,13], or use additional information, for example depth [10,21] or video [21,25]. Overall, the problem is inherently global, such that local decisions as made by the Retinex algorithm give poor results. It would be interesting to investigate whether the global problem, especially for complex scenes, can be decomposed into simpler local problems which are easier to solve. Further, for several applications, we need to ask whether true reflectance is actually necessary to improve the performance of subsequent tasks. For example, in image segmentation, reflectance and shading provided by a state-of-the-art algorithm can serve as additional features without needing to be the true reflectance and shading.

[1] E. H. Land and J. J. McCann. Lightness and retinex theory. Journal of the Optical Society of America, 61(1):1-11, January 1971.

[2] E. H. Land. Recent advances in retinex theory and some implications for cortical computations: Color vision and natural images. In National Academy of Sciences of the United States of America, volume 80, pages 5163-5169, August 1983.

About the Author

After submitting my master thesis last week, my time at the Max Planck Institute in Tübingen is coming to an end. I will, however, not leave the Max Planck Institute completely. Instead, starting in October, I will start a PhD position at the Max Planck Institute for Informatics in Saarbrücken. Advised by Prof. Bernt Schiele, I will continue research in computer vision and deep learning.
9th October 2017, David Stutz
