To preserve the spatial consistency of low-level features, the generalized Riesz-wavelet transform (GRWT) is adopted for fusing multi-modality images. By combining a suitably parameterized fusion model with additional structural information, the proposed method can capture directional image structure at arbitrary orientations. Its fusion patterns are controlled by a heuristic fusion model based on image phase and coherence features, which allows it to extract and preserve structural information efficiently and consistently. A performance analysis on real-world images demonstrates that the proposed method is competitive with state-of-the-art fusion methods, especially in combining structural information.

1. Introduction

Image fusion [1] is a unified process that provides a global description of a physical scene from multi-source images, commonly in the form of a spatial or temporal representation. This technology has been applied in different domains, such as medical imaging, night vision and environmental monitoring [1]. Image fusion methods usually fall into three categories: pixel-level, feature-level and decision-level. Fusion can take advantage of the basic features of different sensors and improve visual perception for various applications. In this paper, we focus on developing a multi-modality image fusion algorithm for visual, infrared, computed tomography (CT) and magnetic resonance imaging (MRI) images.

There are various methods for pixel-level image fusion. First, wavelet-like fusion methods, i.e., transformations based on multi-resolution analysis (MRA) [1], were proposed. The discrete wavelet frame (DWF) transform [2] was presented for fusing different directional information. A low-redundancy extension of DWF, named the low redundant discrete wavelet frame transform (LRDWF), was proposed by Bo Yang et al. [3]. Meanwhile, a fusion method based on non-separable wavelet frames [4] proved effective for remote sensing applications. Moreover, the lifting stationary wavelet [5] combined with a pulse coupled neural network (PCNN) was applied to multi-focus image fusion. Several composite image fusion methods [6], based on the non-subsampled contourlet transform (NSCT) [22] and the shearlet [7], were proposed for different fusion problems; these are referred to as multi-resolution geometric analysis (MGA) based fusion methods. It should be noted, however, that NSCT based fusion methods have low space and time efficiency. Second, sparse representation (SR) based image fusion methods were developed for related fusion problems, such as remote sensing image fusion [8]. Third, compressed sensing (CS) became popular for image fusion [9]. In summary, MGA based image fusion methods commonly lead to contrast reduction or high memory requirements. SR based fusion methods suffer from problems such as high time and space complexity and the blocking effect [1]. CS based image fusion may suffer from reconstruction error.

Recently, dynamic image fusion has been recognized as a challenging general problem and has gained considerable attention. It was first converted into an object-detection based dynamic image fusion problem [10]. Lately, some researchers have focused on schemes for fusing spatial-temporal information [12, 13]. First, Kalman filtered compressed sensing (KFCS) [11] was presented for this problem; it captures spatial-temporal changes of multi-source videos with a separable space-time fusion method. Second, the Surfacelet transform was applied to video fusion by utilizing its tree-structured directional filter banks [12]. Third, temporal information [13] was considered in remote sensing to provide more information about the regions of interest in satellite observations. Despite these works, fusion patterns that consider spatial and temporal direction or other information have not been fully explored. Beyond these problems, spatial and temporal consistency remains challenging.

To preserve spatial consistency and fuse structural information effectively, a novel image fusion method based on the generalized Riesz-wavelet transform [14] is proposed. Its main idea is to develop a heuristic fusion model, built on the capabilities of GRWT, that combines structural information adaptively and consistently. Its main feature is a generalized representation of fusion patterns based on low-level features, which can be extended easily to other fusion problems. Meanwhile, the integration of the high-order Riesz transform with the proposed heuristic fusion model preserves and fuses structural information, such as gradient, contour and texture. This fusion pattern can detect and extract low-level features by utilizing the high-order steerability and excellent angular selectivity of the transform [14]. Unlike other MGA based image fusion methods, the GRWT based fusion method improves the ability to keep spatial consistency in high dimensions. Experiments on real-world images demonstrate that the GRWT based fusion method improves fusion performance, especially with respect to the consistency of structural information.

The rest of the paper is organized as follows. Section 2 summarizes the generalized Riesz-wavelet transform. Section 3 presents the details of the proposed heuristic fusion model. Section 4 presents experimental results on multi-modality images. Finally, discussions and conclusions are presented in Sections 5 and 6, respectively.

2. Generalized Riesz-wavelet transform

- 2.1. Riesz transform and its high order extension

The Riesz transform [14] can be viewed as a natural extension of the Hilbert transform. It is a scalar-to-vector signal operation. The Hilbert transform acts as an all-pass filter, whose transfer function is defined as follows

$$\hat{T}(w) = -j\,\mathrm{sgn}(w) = -j\frac{w}{|w|} \qquad (1)$$

where $\hat{T}(w)$ stands for the transfer function in the frequency domain, $T(x)$ for its space-domain version, and $w$ for the frequency variable. Based on the definition of the Hilbert transform, the Riesz transform can be defined in the frequency domain as follows

$$\widehat{R_i s}(w) = -j\frac{w_i}{\|w\|}\,\hat{s}(w) \qquad (2)$$

where $\hat{s}(w)$ is the Fourier transform of the input signal $s(x)$ and $R_i[s(x)]$ denotes the $i$-th component of the Riesz operator applied to $s(x)$. In the space domain, this transformation can be written as follows

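$$R[s(x)] = \begin{pmatrix} R_1[s(x)] \\ \vdots \\ R_d[s(x)] \end{pmatrix} = \begin{pmatrix} (h_1 * s)(x) \\ \vdots \\ (h_d * s)(x) \end{pmatrix} \qquad (3)$$

where $d$ denotes the dimension of $R[s(x)]$, and the filters $h_n$, $n = 1,\dots,d$, are characterized by the frequency responses $T_n(w) = -jw_n/\|w\|$. The space-domain expression can be written directly as follows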

$$R_i[s(x)] = \mathcal{F}^{-1}\left\{-j\frac{w_i}{\|w\|}\,\hat{s}(w)\right\}(x) \qquad (4)$$

where $\mathcal{F}^{-1}(\cdot)$ denotes the inverse Fourier transform. The factor $1/\|w\|$ in Eq. (4) is the frequency response of the isotropic integral operator $(-\Delta)^{-1/2}$. More precisely, the components $R_i[s(x)]$ can be viewed, up to sign, as the partial derivatives of $y(x) = (-\Delta)^{-1/2}s(x)$. Table 1 summarizes the connection between differential operators and the Riesz transform.

Table 1. Summary of the Riesz transform and other differential operators [14] with d = 1, 2, 3, 4.

Remark: The Riesz transform has a natural connection to the partial derivative or gradient operator. Details can be found in [14].
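The first-order Riesz transform described above can be sketched numerically for d = 2 by multiplying the image spectrum with the frequency responses -j w_n/‖w‖. This is only an illustrative sketch: the function name `riesz_transform`, the use of NumPy's FFT, and the choice to set the response to zero at the DC frequency are assumptions, not the authors' MATLAB implementation.

```python
import numpy as np

def riesz_transform(s):
    """First-order Riesz transform of a 2D array via the FFT (illustrative sketch).

    Returns (R1 s, R2 s), using the frequency responses -j * w_n / ||w||.
    """
    rows, cols = s.shape
    w1 = np.fft.fftfreq(rows)[:, None]   # frequency along axis 0
    w2 = np.fft.fftfreq(cols)[None, :]   # frequency along axis 1
    norm = np.sqrt(w1**2 + w2**2)
    norm[0, 0] = 1.0                     # avoid 0/0 at DC (assumption: zero response there)
    S = np.fft.fft2(s)
    r1 = np.real(np.fft.ifft2(-1j * w1 / norm * S))
    r2 = np.real(np.fft.ifft2(-1j * w2 / norm * S))
    return r1, r2
```

For an image that varies only along the second axis as a cosine, the second component is the corresponding sine and the first component vanishes, which mirrors the behavior of the 1D Hilbert transform.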

The higher-order Riesz transform of the input signal is defined as follows

$$R_{i_1,i_2,\dots,i_N}[s(x)] = R_{i_1}R_{i_2}\cdots R_{i_N}[s(x)] \qquad (5)$$

where the indices $i_1,i_2,\dots,i_N \in \{1,\dots,d\}$ select the first-order components that are composed to form an $N$-th order term. There are $d^N$ ways to construct $N$-th order terms. The directional behavior of the generalized Riesz-wavelet transform can be obtained as follows

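$$\widehat{H_u y}(w) = -j\frac{\langle w,u\rangle}{\|w\|}\,Y(w) \qquad (6)$$

where $Y$ denotes the Fourier transform of $y(x)$, $u = [u_1,\dots,u_i,\dots,u_d]$ is the unit vector for angular selectivity [14], and $H_u = \sum_{i=1}^{d}u_iR_i$ denotes the Riesz transform steered in the direction $u$. Iterating $N$ times, the previous equation becomes

$$\widehat{H_u^N y}(w) = \left(-j\frac{\langle w,u\rangle}{\|w\|}\right)^{N} Y(w) \qquad (7)$$

The space-domain counterpart of Eq. (7) can be expressed as follows

$$H_u^N y(x) = \sum_{n_1+\cdots+n_d=N} c_{n_1,\dots,n_d}(u)\,R^{n_1,\dots,n_d}\,y(x) \qquad (8)$$

where $c_{n_1,\dots,n_d}(u)$ denotes the steering coefficients of the $N$-th order Riesz transform, and $(n_1,\dots,n_d)$ is the multi-index that identifies the normalized $N$-th order component $R^{n_1,\dots,n_d} = \sqrt{N!/(n_1!\cdots n_d!)}\,R_1^{n_1}\cdots R_d^{n_d}$ [14]. The steering coefficients can be obtained by

$$c_{n_1,\dots,n_d}(u) = \sqrt{\frac{N!}{n_1!\cdots n_d!}}\,u_1^{n_1}\cdots u_d^{n_d} \qquad (9)$$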
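As a small numerical illustration of steering for d = 2, the sketch below evaluates steering coefficients under the assumption that they take the multinomial form used in [14], namely c(u) = sqrt(N!/(n1! n2!)) u1^n1 u2^n2 with u = (cos φ, sin φ). The function name `steering_coefficients` and the parameterization by an angle are hypothetical choices made for this example.

```python
import math
import numpy as np

def steering_coefficients(order, angle):
    """Steering coefficients for d = 2 and u = (cos(angle), sin(angle)).

    Assumes the multinomial form sqrt(N! / (n1! n2!)) * u1**n1 * u2**n2.
    The returned array is indexed by n1 = 0, ..., order.
    """
    u1, u2 = math.cos(angle), math.sin(angle)
    coeffs = []
    for n1 in range(order + 1):
        n2 = order - n1
        weight = math.sqrt(math.comb(order, n1))   # sqrt(N! / (n1! n2!))
        coeffs.append(weight * u1**n1 * u2**n2)
    return np.array(coeffs)
```

Because u is a unit vector, the squared coefficients sum to (u1² + u2²)^N = 1 by the binomial theorem, so steering to a new direction redistributes energy among the N + 1 components without changing its total.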

- 2.2. 2D Generalized Riesz-wavelet transform

The 2D generalized Riesz-wavelet transform [14] follows directly from the definition of the Riesz transform in Subsection 2.1. For the $N$-th order Riesz transform with $d = 2$, there are $N + 1$ individual components, which span the subspace $V_{N,2} = \mathrm{span}\{R^{n_1,N-n_1}\}_{n_1=0}^{N}$. The explicit form of the frequency response $\hat{R}^{n_1,N-n_1}(w)$ is given as follows

$$\hat{R}^{n_1,N-n_1}(w) = \sqrt{\frac{N!}{n_1!(N-n_1)!}}\;\frac{(-jw_1)^{n_1}(-jw_2)^{N-n_1}}{\|w\|^{N}} = (-j)^N\sqrt{\frac{N!}{n_1!(N-n_1)!}}\cos^{n_1}(\theta)\sin^{N-n_1}(\theta) \qquad (10)$$

where $\cos(\theta) = w_1/\sqrt{w_1^2+w_2^2}$ and $\sin(\theta) = w_2/\sqrt{w_1^2+w_2^2}$ in polar coordinates. These basis functions have a simpler form as $2\pi$-periodic functions of the angle $\theta$ [14], which can be expressed as follows

$$\hat{R}^{m,n}(z) = (-j)^{m+n}\sqrt{\frac{(m+n)!}{m!\,n!}}\left(\frac{z+z^{-1}}{2}\right)^{m}\left(\frac{z-z^{-1}}{2j}\right)^{n} \qquad (11)$$

where $z = e^{j\theta}$, $m$ and $n$ are the orders of the frequency responses of the Riesz components along the two axes, and $m + n = N$. The corresponding wavelet basis function for directional analysis can be obtained as follows

$$\psi(x) = (-\Delta)^{\gamma/2}\beta_{2\gamma}(2x) \qquad (12)$$

where $\gamma > d/2$ is the order of the wavelet. The function $\beta_{2\gamma}(2x)$ is the B-spline of order $2\gamma$, a smoothing kernel that converges to a Gaussian as $\gamma$ increases [14]. Fig. 1 presents the flow chart of GRWT. Based on Eq. (12), the Riesz-wavelet coefficients $w_i^{(n)}[k]$ can be obtained by

$$w_i^{(n)}[k] = \langle s, \psi^{(n)}_{i,k}\rangle \qquad (13)$$

where $k$ stands for the location in the multi-resolution transformation, $i$ stands for the current decomposition level, and $\psi^{(n)}_{i,k}$ is the $n$-th Riesz-wavelet basis function at that level and location.

In summary, the mapping provided by the generalized Riesz-wavelet transform [14] preserves image structure thanks to the L2-stability of the representation. This property keeps the multi-scale decomposition from introducing blocking effects and other artifacts. Moreover, the operation does not amplify high-frequency components. The decomposition is fast and only moderately redundant [14] compared to the NSCT based signal decomposition.
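The L2-stability mentioned above can be checked numerically for the Riesz part of the transform. The sketch below builds the N + 1 components of the N-th order 2D Riesz transform from angular frequency responses of the form (-j)^N sqrt(N!/(n1!(N-n1)!)) cos^n1(θ) sin^(N-n1)(θ), which is the normalization assumed here following [14], and compares the total energy of the components with that of a zero-mean input. The function name `riesz_components_2d` is hypothetical, and the wavelet (multi-scale) stage is deliberately left out.

```python
import math
import numpy as np

def riesz_components_2d(s, order):
    """Return the order + 1 components of the N-th order 2D Riesz transform of s.

    Illustrative sketch only: the frequency responses are sampled on the FFT
    grid and the response at the DC frequency is set to zero.
    """
    rows, cols = s.shape
    w1 = np.fft.fftfreq(rows)[:, None]
    w2 = np.fft.fftfreq(cols)[None, :]
    norm = np.sqrt(w1**2 + w2**2)
    norm[0, 0] = 1.0                       # avoid 0/0 at DC
    cos_t, sin_t = w1 / norm, w2 / norm    # cos(theta), sin(theta) on the grid
    S = np.fft.fft2(s)
    components = []
    for n1 in range(order + 1):
        gain = math.sqrt(math.comb(order, n1))
        response = (-1j)**order * gain * cos_t**n1 * sin_t**(order - n1)
        response[0, 0] = 0.0               # no response at DC
        components.append(np.real(np.fft.ifft2(response * S)))
    return components
```

Since the squared magnitudes of the N + 1 responses sum to (cos²θ + sin²θ)^N = 1 at every non-zero frequency, the summed energy of the components equals the energy of a zero-mean input. The equality is exact on grids with odd side lengths; with even side lengths the Nyquist bins introduce a small discrepancy.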

3. Heuristic fusion model and framework

- 3.1 Heuristic fusion model

In this subsection, the fusion model is introduced. It is called the heuristic fusion model and can be expressed in general form as follows

[Equation (14)]

where $\sigma^2$ is the degradation factor for the feature space $u_i$. The degradation factor not only projects $u_i$ onto a suitable scale, but also weighs the $u_i$ discriminatingly. $u_i$ determines the importance of the correlation of the feature spaces $f_i$, $i = 1,\dots,N$. Smaller values of $\sigma^2$ assign less importance to $u_i$, while larger values assign more. The feature spaces are assumed to be combined into a tensor based on the distance between the input low-level features, i.e., $u_i = \mathrm{Dist}(f_1,f_2)$. The feature spaces $f_1$ and $f_2$ are built from each GRWT subband of the multi-modality images. In this parameterization, $u_i$ represents the strength of the interaction between the feature spaces.

For example, the fusion coefficient $C_V$ for the visual image can be written explicitly as

[Equation (15)]

where $u_1$ and $u_2$ denote the strengths of the feature spaces of image phase and image coherence respectively, which are extracted from each GRWT subband of the input multi-modality images. The choice of the two degradation factors depends on the selection of the feature space, especially on its volume.

Remark: The proposed heuristic fusion model is a generalized representation of feature-based image fusion patterns. The low-level features may include image phase, coherence, orientation, regions, etc. In this paper, two feature spaces are generated: one from image phase, denoted by uP, and one from image coherence, denoted by uC. These features span a tensor-based feature space, which may provide a potential research direction for developing heuristic or additive feature-based fusion models.

Finally, following a common assumption in image fusion [1], the fusion coefficients $C_i$ used in the fusion process sum to 1. For example, for the fusion of visual and infrared images this can be expressed mathematically as

$$C_V + C_I = 1 \qquad (16)$$

where the subscripts V and I stand for the visual and infrared images respectively. These fusion coefficients are determined by Eq. (15). In other words, the weighting by the fusion coefficients $C_i$ can be viewed as a convex combination.
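The convex-combination view of the fusion coefficients can be illustrated with a short sketch. Here the weight `c_v` is a hypothetical input: in the proposed method it would come from the heuristic fusion model of Eq. (15), which is not reproduced in this example. The function name `convex_fuse` is likewise an assumption made for illustration.

```python
import numpy as np

def convex_fuse(coeff_v, coeff_i, c_v):
    """Fuse two coefficient arrays with weights c_v and c_i = 1 - c_v.

    c_v may be a scalar or an array of per-coefficient weights; it is clipped
    to [0, 1] so that the result is always a convex combination.
    """
    c_v = np.clip(np.asarray(c_v, dtype=float), 0.0, 1.0)
    c_i = 1.0 - c_v                        # enforces C_V + C_I = 1
    return c_v * coeff_v + c_i * coeff_i   # element-wise products
```

A consequence of the constraint is that every fused coefficient lies between the two source coefficients at the same position, so the fusion cannot overshoot either input.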

- 3.2. Proposed fusion method

In this section, the GRWT based fusion process is described. The flow chart of the proposed fusion framework is presented in Fig. 2. We assume that the input images are spatially registered. The main workflow is summarized as follows

1. Input two images, denoted by A and B.

2. Register the two input images.

3. Select the order of the 2D Riesz transform, as defined in Eq. (5).

4. Choose the number of decomposition levels of GRWT, which determines the number of subbands.

5. Analyze the multi-modality images A and B by GRWT based on Eq. (12) and Eq. (13). This process produces transformation coefficients at different scales:

$$DC_i^A = \mathrm{GRWT}(A), \qquad DC_i^B = \mathrm{GRWT}(B) \qquad (17)$$

where $DC_i^A$ and $DC_i^B$ denote the GRWT decomposition coefficients of images A and B respectively at scale i.

6. Extract the corresponding feature spaces from the decomposition coefficients at each decomposition scale, which can be expressed as follows

$$PC_i^A = \mathrm{Phase}(DC_i^A), \quad PC_i^B = \mathrm{Phase}(DC_i^B), \quad CC_i^A = \mathrm{Coherence}(DC_i^A), \quad CC_i^B = \mathrm{Coherence}(DC_i^B) \qquad (18)$$

The function Phase(•) extracts the image phase from the decomposition coefficients of images A and B. Similarly, the function Coherence(•) calculates the image coherence. $PC_i^A$ and $PC_i^B$ denote the phase coefficients of images A and B respectively, and $CC_i^A$ and $CC_i^B$ are their image coherence coefficients. The detailed steps for obtaining image phase and coherence can be found in [21].

7. Based on these definitions, obtain the features $u_1$ and $u_2$ by

$$u_1 = \mathrm{Dist}(PC_i^A, PC_i^B), \qquad u_2 = \mathrm{Dist}(CC_i^A, CC_i^B) \qquad (19)$$

where the function Dist(•,•) denotes the distance between two coefficient matrices. In this paper, the chessboard distance [23] is chosen, because larger values indicate better stability of the fused images.

8. Compute the fused coefficients $FC_i$ at each scale i. The fusion procedure can be summarized as follows

[Equations (20) and (21)]

where the symbol $\circ$ denotes the Hadamard product.

9. Based on the fused coefficients $FC_i$, generate the resulting fused image F by

$$F(x) = \sum_{i}\sum_{k\in\mathbb{Z}^2} FC_i[k]\,\psi^{(n)}_{i,k}(x) \qquad (22)$$

where i denotes the scale or decomposition level, $k \in \mathbb{Z}^2$ stands for the location, i.e., (x, y), and $\psi^{(n)}_{i,k}(x)$ denotes the n-th order Riesz-wavelet basis function [14].
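Step 7 of the workflow above can be sketched as follows. The sketch assumes that the chessboard distance between two coefficient matrices is taken as the Chebyshev distance between them viewed as vectors, i.e., the largest absolute element-wise difference; this reading, the function names, and the phase and coherence maps passed in are illustrative assumptions, since the extraction procedure of [21] is not reproduced here.

```python
import numpy as np

def chessboard_distance(a, b):
    """Chessboard (Chebyshev) distance: the largest absolute element-wise difference."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.max(np.abs(a - b)))

def feature_strengths(phase_a, phase_b, coh_a, coh_b):
    """Return (u1, u2) for one subband from hypothetical phase and coherence maps."""
    u1 = chessboard_distance(phase_a, phase_b)   # strength of the phase feature space
    u2 = chessboard_distance(coh_a, coh_b)       # strength of the coherence feature space
    return u1, u2
```

With this definition, each subband yields two scalars, which would then enter the heuristic fusion model to produce the fusion coefficients.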

4. Experimental results

To assess the effectiveness of the proposed fusion method, empirical experiments are performed on multi-modality images. Five objective indexes are used to evaluate fusion performance: entropy (EN) [1], mutual information (MI) [1], the structural similarity index (SSIM) [15], the feature similarity index (FSIM) [16] and the edge information preservation index [17], denoted by Qab/f. The definitions of FSIM, EN and MI are given in Appendix A. Larger values of these indexes indicate better fusion performance. FSIM and SSIM each produce two numerical results, one for each input image used as the reference; in this paper, the larger one is selected to represent the fusion capability of each method. It should be noted that the completeness and fidelity of the structural information in the fused image are important to the success of image fusion. To evaluate computational cost, the execution time in seconds, denoted by Time(s), is also reported for all fusion methods. The experiments are performed on a computer with an Intel Core Quad CPU Q6700 and 3 GB of RAM. All algorithms are implemented in MATLAB 2010.
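As a rough illustration of two of the indexes, the sketch below computes EN and MI from grey-level histograms of 8-bit images. It uses the standard Shannon definitions as an assumption, since the exact definitions in Appendix A are not reproduced here; the function names are hypothetical. In the fusion literature, the MI index of a fused image is commonly reported as the sum of its mutual information with each source image.

```python
import numpy as np

def entropy(img, levels=256):
    """Shannon entropy (in bits) of the grey-level histogram of an integer image."""
    hist = np.bincount(img.ravel(), minlength=levels).astype(float)
    p = hist / hist.sum()
    p = p[p > 0]                           # ignore empty bins (0 * log 0 = 0)
    return float(-np.sum(p * np.log2(p)))

def mutual_information(a, b, levels=256):
    """Mutual information (in bits) between two integer images of equal shape."""
    joint = np.zeros((levels, levels))
    np.add.at(joint, (a.ravel(), b.ravel()), 1)   # joint grey-level histogram
    pab = joint / joint.sum()
    pa = pab.sum(axis=1, keepdims=True)    # marginal distribution of a
    pb = pab.sum(axis=0, keepdims=True)    # marginal distribution of b
    mask = pab > 0
    return float(np.sum(pab[mask] * np.log2(pab[mask] / (pa @ pb)[mask])))
```

For example, an image whose pixels are split evenly between two grey levels has an entropy of one bit, and the mutual information of any image with itself equals its entropy.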

The proposed method is compared with five fusion methods: wavelet, dual-tree complex wavelet transform (DT-CWT), low redundant discrete wavelet frame transform (LRDWF) [3], discrete wavelet frame transform (DWF) [2] and Shearlet [19]. For these multi-scale decomposition methods, the number of decomposition levels is four, the fusion rule selects the largest decomposition coefficients, and the basis function is 'db4'. For the proposed GRWT based fusion method, setting the order of the Riesz transform to 1 and the decomposition level to 4 gave better fusion performance in our experiments.

- 4.2 Fusion results by navigation images

To assess the proposed method in different imaging situations, we test the fusion methods on navigation images captured in the visual and infrared bands. Samples of the navigation images are displayed in Fig. 3. The size of the navigation images is 512 × 512. The visual fusion results are presented in Fig. 4, and the numerical results in Table 2. It can be seen that the GRWT based method constructs a more complete representation of the perceived scene than the other fusion methods. Although the visual result of the Shearlet based method resembles the fine detail of the GRWT based method, the numerical results for Qab/f, SSIM and FSIM indicate that the Shearlet fusion process may damage the perception of local image content. The proposed method outperforms the other fusion methods because of the proposed fusion model and its ability to select and reconstruct structural information. These results validate the effectiveness of the proposed fusion method.

The GRWT based fusion method strikes a balance between fusion performance and computational requirements. Compared to the Wavelet, LRDWF, DT-CWT and DWF based fusion methods, the Shearlet based fusion method achieved higher fusion performance in terms of EN, MI, SSIM and FSIM. In terms of execution time, however, the Shearlet [19] based fusion method required 9.6650 seconds, whereas the GRWT based fusion required only 0.9135 seconds, roughly a tenfold difference. The execution time of the proposed algorithm is only slightly higher than that of the remaining fusion methods. The Wavelet based fusion method required the least time, about 0.1480 seconds, but its overall fusion performance is the lowest among the compared fusion methods.

- 4.3 Fusion results by visual and near-infrared images

In this subsection, experiments are performed to assess the fusion performance on visual and near-infrared images. The size of these two images is 256 × 256. Samples of the two images are presented in Fig. 5, and the numerical outcomes are given in Table 3. Although the visual contrast of the Shearlet based method is higher than that of GRWT, its numerical evaluation results indicate that the Shearlet fusion process may destroy local structures of the scene, which results in low fusion performance in terms of EN, Qab/f, SSIM and FSIM. In summary, the fusion performance of the GRWT based method demonstrates that its fusion pattern can capture and select natural structural information or low-level features and achieve better fusion performance. Meanwhile, the computation time of the proposed fusion method behaves as in Section 4.2. The visual results are displayed in Fig. 6. The fusion result of the GRWT based method contains more contrast details than those of the other fusion methods. The visual results of Wavelet, LRDWF, DT-CWT and DWF lose some of the contrast information present in the original scenes. The reason is that the smoothing effect of these multi-scale decomposition methods discards contrast details to some degree. In other words, these methods suffer from a loss of local contrast or damage to structural information. The numerical evaluation results indicate that the proposed fusion method preserves more high-order structural information, such as gradient and texture. The fused image created by the GRWT based method is superior to those of the other fusion methods.

Fig. 6. Visual comparison of six fusion methods on visual and near-infrared images.

- 4.4 Fusion results by medical images

In this section, the fusion performance of the proposed method is assessed numerically and inspected visually on CT and MRI images. The sample images are displayed in Fig. 7. The size of these images is 256 × 256. According to Table 4, our method is better than the other fusion methods in terms of EN, SSIM, FSIM and Qab/f. These numerical results indicate that the proposed fusion method transfers more information than the other methods. Moreover, the computational requirement of the proposed fusion method is close to that of the DT-CWT and DWF based fusion methods. A visual examination of the image fused by GRWT, displayed in Fig. 8, shows that it contains more salient features and details than the results of the other fusion methods. In other words, different features from the two images are combined in Fig. 8(f). This observation is supported by the fusion performance measured by SSIM, FSIM and Qab/f. The definitions of these indexes indicate that structural information, such as gradient, texture and edge information, is integrated efficiently into the final fused result. In other words, the proposed method achieves a comprehensive improvement in fusion performance with regard to both objective evaluation indexes and visual effects.

To investigate the fusion performance more fully, the proposed method was assessed on a classical public dataset, which was captured by a Canon camera with appropriate modifications. All fusion methods were applied to 20 image pairs. Details on the acquisition of the near-infrared images [20] can be found at the web site: http://ivrg.epfl.ch/research/infrared/imaging. The original size of these images is 1024 × 679. To simplify the evaluation of all fusion methods, the input images are cropped to a square size of 512 × 512. These images are presented in Fig. 9. The resulting fused images are not displayed in this section for reasons of space. Table 5 presents the average numerical results over the 20 pairs of visual and near-infrared images. It can be seen again that the GRWT based method outperforms the other fusion methods in terms of the five objective fusion indexes. For the MI index, the GRWT based method behaves much better than the other fusion methods. The other four evaluation indexes, i.e., EN, SSIM, FSIM and Qab/f, indicate that the GRWT based method transfers more structural information into the fusion results. As with the execution times in Tables 2, 3 and 4, the proposed fusion method again achieves a balance between fusion performance and computational cost.

5. Discussion

A general challenge in MRA based image fusion methods is the consistency of the fusion result, which depends on the representation accuracy of the MRA based method and on the fusion rule. The proposed method can deal with this problem to some degree, as shown by the visual results in Fig. 4(f), Fig. 6(f) and Fig. 8(f). Visual inspection of these results indicates that other MRA based methods may damage local contrast features or produce visual artifacts. The numerical results in Table 2, Table 3 and Table 4 also show that the GRWT based method not only keeps the contrast of the input images, but also preserves the spatial consistency of contour, gradient and texture. This is supported by the numerical results for EN, MI, Qab/f, SSIM and FSIM.

Beyond this, the superiority of the presented method lies in the preservation of image content coherence. Shearlet based fusion methods may incur a cost of about eight times that of the original image, much more than GRWT. In real-time applications, such behavior may not be acceptable. The proposed method alleviates this problem to some degree and balances space and time complexity.

6. Conclusion

A novel image fusion method that integrates a heuristic fusion model with the generalized Riesz-wavelet transform is presented. Exploiting the ability of the proposed fusion model to examine and select structural information, the proposed method combines image content efficiently. A variety of experiments illustrated that the congruency of phase and gradient magnitude is important to the success of an image fusion method. The numerical and visual results, obtained with five objective indexes and visual examination, show that the presented fusion method is suitable for multi-modality image fusion. Moreover, the GRWT based fusion method can capture salient features with sharper intensity changes and keep the consistency of directional edges and texture.

Acknowledgements

This work is supported by the National Natural Science Foundation of China (Grant No. 61175028) and the Ph.D. Programs Foundation of Ministry of Education of China (Grant No. 20090073110045).

BIO

Bo Jin received his B.S. degree in electronic information and electrical engineering from Shanghai Jiao Tong University, China, in 2004. He is currently pursuing the Ph.D. degree in electronic information and electrical engineering at Shanghai Jiao Tong University, Shanghai, China. His major research interests include machine learning, incremental learning, face recognition, visual tracking, and image fusion.

Zhongliang Jing received his B.S., M.S., and Ph.D. degrees from Northwestern Polytechnical University, Xi’an, China, in 1983, 1988, and 1994, respectively, all in Electronics and Information Technology. Currently, he is Cheung Kong professor, and executive dean at the School of Aeronautics and Astronautics, Shanghai Jiao Tong University, China. Prof. Jing is an editorial board member of the Science China: Information Sciences, Chinese Optics Letters as well as International Journal of Space Science and Engineering. His major research interests include multi-source information acquisition, processing and fusion, target tracking, and aerospace control.

Han Pan was born in GuangXi, P.R. China in 1983. He received his Ph.D. degree from Shanghai Jiao Tong University, Shanghai, China, in 2014. He is currently a postdoctoral fellow at Shanghai Jiao Tong University, Shanghai, China. His research interests include image restoration, information fusion, and convex optimization.