Abstract

An increasing number of applications require the integration of data from various disciplines, which leads to problems in the fusion of multi-source information. In this paper, a special information structure formalized in terms of three indices (the central presentation, population or scale, and density function) is proposed. Single and mixed Gaussian models are used for single-source information and its fusion results, and a parameter estimation method is also introduced. Furthermore, fuzzy similarity computing is developed for solving the fuzzy implications under a Mamdani model and a Gaussian-shaped density function. Finally, an improved rule-based Gaussian-shaped fuzzy control inference system is proposed in combination with a nonlinear conjugate gradient and a Takagi-Sugeno (T-S) model, which demonstrates the effectiveness of the proposed method as compared to other fuzzy inference systems.

1 Introduction

The human brain obtains information from different sources; it then merges this information to form concepts and finally outputs natural language (NL), which is powerful and versatile enough to describe the real world. NL can be regarded as the fusion of disparate information; it is vague, ambiguous, and uncertain. The quantitative calculation and qualitative analysis of NL is the ultimate goal of artificial intelligence. There are two strands of research linking initial information acquisition with NL: (1) how to simplify the presentation of NL and (2) how to form NL from multi-source information. Usually, humans express emotions about certain objects by using sentences and affective words, but they cannot fully express their intuitive perception of an object simply by separating these terms. Natural Language Processing (NLP) was developed to solve this problem; however, many difficulties remain in this field. Computing with Words (CW) was also introduced to decrease the complexity related to linguistic variables [16–18]. This has allowed for a more exact expression of the meaning of what a human is thinking about and has provided a feasible direction for NLP under weakened conditions. Zadeh introduced a framework for this phenomenon of uncertainty using Fuzzy Sets (FS) in 2005 [19]. FS theory has also been used to describe objects at a coarse-grained level. Herrera and Martínez [5] introduced a 2-tuple fuzzy linguistic representation model for CW without any loss of information. Furthermore, Lawry [13, 14] proposed Label Semantics (LS) for vague concept modeling and reasoning techniques so as to formalize uncertainty in presentation theory. Subsequently, Lawry and Tang [12, 34, 35] proposed a new semantic understanding model: the Prototype Theory (PT). These works established the connection between fuzzy presentation technology and high-level semantics.
In engineering fields, linguistic representation models combined with affective words have had some applications, such as fuzzy decision making [21, 31] and KANSEI Engineering (KE). Fuzzy inference methodologies have also been shown to be effective in our previous work on Rough Sets [7] and Fuzzy Support Vector Machines (SVMs) [6].

However, it has been regarded as more feasible to focus on multi-source information fusion rather than on NL itself. Moreover, it is important to discover the mechanism by which the human brain integrates multi-source information. Due to the modular and vague appearance of multi-source information, uncertainty reasoning methods and their associated mathematical tools are thought to offer more interpretability and a much stronger generalization capability [24]. Yager developed the theoretical foundation for multi-source information fusion techniques based on set measure and possibility theories [25, 26]. Normally, single-source information consists of steady features that are more easily formalized and parameterized. In previous studies, the sum, product, max/min, and Weighted Arithmetic Mean (WAM) were used to combine single-source information, and each output represented an independent source of information that could be treated separately [15].

Relative to mathematical research and understanding the phenomenon of uncertainty, the integration of information using fuzzy inference techniques pervades many scientific disciplines, such as multivariate and type-2 fuzzy sets; bipolar models [10, 11]; and probability and possibility issues [9, 27]. Information fusion is the merging of information from disparate sources with differing conceptual, contextual, and typographical representations. It has been successfully applied in data mining and the consolidation of data from unstructured or semi-structured resources, and it has also led to many achievements in various fields [1, 4, 8]. Fusion methods include product fusion (such as the Bayes posterior probability model), linear fusion (SVM classifiers), and nonlinear fusion (super-kernel integration) [23]. Recent developments and applications of fuzzy information fusion can be found in pattern classification, image analysis, decision-making, man-made structures, and medicine [30, 32]. Furthermore, over the past several years, there has been a number of successful applications of fuzzy integrals in decision-making and pattern recognition that have employed multiple information sources [3, 20].

In this paper, we formalize multi-source information as a multivariable group and describe each information structure as a special kind of triple, I = < P, d, ρ >, where P denotes a typical point of positive examples relative to the information structure I, d is a distance measurement that represents the population of information, and ρ is a Probability Density Function (PDF). The basic idea of this formalized information structure is to assume that the neighborhood radius of each information structure is uncertain and limited by the PDF ρ. Thus, we calculate the value of P relative to an information structure at a given level. We develop an information fusion technique by formalizing this special information structure and apply fuzzy-set-based information fusion. A Single Gaussian Model (SGM) is applied to single-source information, and a Gaussian Mixture Model (GMM) is applied to the fusion of this information by incorporating probabilistic and statistical methods [28, 36].

The remainder of this paper proceeds as follows. In Section 2, we propose an information structure that incorporates a definition of the information kernel, boundary, and Gaussian PDF. An improved algorithm for parameter estimation is also introduced. Section 3 introduces fuzzy similarity relations and IF-THEN rules for this special information structure. These are helpful for calculating the possibilities in a rule-based fuzzy inference system (FIS). Section 4 develops a rule-based information fusion model using a conjugate gradient and Takagi–Sugeno (T-S) model under a rule-based Gaussian-shaped fuzzy inference system (RGS-FIS). A time-series analysis using natural disaster datasets is also introduced using RGS-FIS, and we demonstrate the effectiveness of our method in comparison to other methodologies. Finally, in Section 5, we give our conclusions and ideas for future work.

2 Information fusion models using probability density functions

2.1 Definitions

Definitions for our information structure and kernel computing method were established as follows.

Definition 1. Assume object Ω is described by the multi-source information set I = {Ik|k = 1, 2, ⋯ , m} and that measure set V = {vk|k = 1, 2, ⋯ , m} is a set of information structures corresponding to set I. For ∀vk ∈ V, we define vk =< Pk, dk, ρk >, where Pk is a typical point as the kernel of Ik. Moreover, dk is a metric of the information structure vk related to the population or scale of information and will be used for boundary computing. Lastly, ρk is a density function on the threshold of vk.

Definition 2. Let the fusion operator be ⊕, so that Ω can be formalized as:

Ω = v1 ⊕ v2 ⊕ ⋯ ⊕ vm

2.2 Probability density function

The Gaussian distribution is a continuous probability distribution with a bell-shaped PDF in one-dimensional space:

(6)

f(x,\mu,\sigma^2)=\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}

The parameter μ is the mean or expectation, and σ² is the variance. The SGM is applied to derive the density function of the proposed information structure I, and we define:

(7)

\delta(X,\mu,\Phi)=\frac{1}{\sqrt{(2\pi)^n|\Phi|}}\,e^{-\frac{1}{2}(X-\mu)^T\Phi^{-1}(X-\mu)}

where X is a vector in n-dimensional space, Φ is the covariance matrix, and μ is the mean value of the density function. The density function’s properties are determined by (Φ, μ), so this is a parameter estimation problem [29]. For any point Pi ∈ Rn, its probability density function is δ (Pi, μ, Φ), and if, for any information structure vk, each Pi in vk is regarded as an independent event, then the PDF of vk is:

(8)

\delta_k=\delta(v_k,\mu,\Phi)=\prod_{i=1}^{m}\delta(P_i,\mu,\Phi)

The maximum likelihood estimation can be used to estimate the parameters (Φ, μ) under (8). Taking the logarithm of (8), we have:

(9)

O(\mu,\Phi)=\ln\Big(\prod_{i=1}^{m}\delta(P_i,\mu,\Phi)\Big)=\sum_{i=1}^{m}\ln\big(\delta(P_i,\mu,\Phi)\big)=\sum_{i=1}^{m}\Big[-\frac{n}{2}\ln(2\pi)-\frac{1}{2}\ln|\Phi|-\frac{1}{2}(P_i-\mu)^T\Phi^{-1}(P_i-\mu)\Big]=-\frac{nm}{2}\ln(2\pi)-\frac{m}{2}\ln|\Phi|-\frac{1}{2}\sum_{i=1}^{m}(P_i-\mu)^T\Phi^{-1}(P_i-\mu)

Taking the partial derivative w.r.t. μ of O (μ, Φ) and setting it to 0, we obtain the following:

(10)

\partial_{\mu}O(\mu,\Phi)=-\frac{1}{2}\sum_{i=1}^{m}\big[-2\Phi^{-1}(P_i-\mu)\big]=\Phi^{-1}\sum_{i=1}^{m}(P_i-\mu)=\Phi^{-1}\Big[\sum_{i=1}^{m}P_i-m\mu\Big]=0

This gives μ̂ = (1/m) ∑i Pi. Similarly, for Φ, we obtain Φ̂ = (1/m) ∑i (Pi - μ̂)(Pi - μ̂)^T. Thus, if the density of each point in vk is δ(P, μ̂, Φ̂), then our estimation of the parameter μ is:

(11)

\hat{\mu}=\Big(\frac{1}{m}\sum_i e_{1i},\ \frac{1}{m}\sum_i e_{2i},\ \cdots,\ \frac{1}{m}\sum_i e_{ni}\Big)

where e_li is the l-th coordinate of Pi in R^n.
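The maximum-likelihood estimates of (μ, Φ) above can be sketched numerically. The following is a minimal sketch (not from the paper; the function names are illustrative) that computes μ̂ and Φ̂ from m sample points and evaluates the SGM density of Eq. (7):

```python
import numpy as np

def estimate_sgm(points):
    """ML estimates (mu_hat, Phi_hat) for a single Gaussian model fitted to
    m points in R^n: mu_hat = (1/m) sum_i P_i and
    Phi_hat = (1/m) sum_i (P_i - mu_hat)(P_i - mu_hat)^T."""
    P = np.asarray(points, dtype=float)   # shape (m, n)
    m = P.shape[0]
    mu_hat = P.mean(axis=0)               # componentwise average of coordinates e_li
    centered = P - mu_hat
    phi_hat = centered.T @ centered / m   # biased (ML) covariance estimate
    return mu_hat, phi_hat

def sgm_density(x, mu, phi):
    """Multivariate Gaussian density delta(X, mu, Phi) of Eq. (7)."""
    n = len(mu)
    diff = np.asarray(x, dtype=float) - mu
    norm = np.sqrt((2 * np.pi) ** n * np.linalg.det(phi))
    return np.exp(-0.5 * diff @ np.linalg.solve(phi, diff)) / norm
```

The ML covariance uses the 1/m factor that follows from maximizing (9); an unbiased variant would divide by m - 1 instead.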

The covariance Φˆ is converted to

(12)

For multi-source information fusion, we need to calculate all of the Ik’s density functions as well as the new, fused density function. For a mixture of l components, let I_fusion = ∑i αi δ(P, μi, Φi) for normalized weight parameters αi: i.e., ∑i αi = 1. To calculate and simplify the covariance matrix Φ, let

Then, for Δ = cI, c ∈ R, the GMM is defined as G(P) = ∑i αi δ(P, μi, σi), i = 1, 2, ⋯ , l. The number of parameters to estimate is 3l. If we let θ = [α1, α2, ⋯ , αl, μ1, μ2, ⋯ , μl, σ1², σ2², ⋯ , σl²], the objective is:

(19)

L(\theta)=\sum_{i=1}^{m}\ln\Big(\sum_{j=1}^{l}\alpha_j\,\delta(P_i,\mu_j,\sigma_j^2)\Big)

(20)

\hat{\theta}=\arg\max_{\theta}L(\theta)

For αj, under the constraint ∑j αj = 1, we use Lagrange multipliers to re-define the objective as:

(21)

J=L(\theta)+\lambda\Big(1-\sum_{j=1}^{l}\alpha_j\Big)=\sum_{i=1}^{m}\ln\Big(\sum_{j=1}^{l}\alpha_j\,\delta(P_i,\mu_j,\sigma_j^2)\Big)+\lambda\Big(1-\sum_{j=1}^{l}\alpha_j\Big)

Differentiating this new object w.r.t. αj, we have that:

(22)

\partial_{\alpha_j}J=\sum_{i=1}^{m}\frac{\delta(P_i,\mu_j,\sigma_j^2)}{\sum_{k=1}^{l}\alpha_k\,\delta(P_i,\mu_k,\sigma_k^2)}-\lambda=\frac{1}{\alpha_j}\sum_{i=1}^{m}\varphi_j(P_i)-\lambda=0

where \varphi_j(P_i)=\alpha_j\,\delta(P_i,\mu_j,\sigma_j^2)\big/\sum_{k=1}^{l}\alpha_k\,\delta(P_i,\mu_k,\sigma_k^2) is the responsibility of component j for point Pi.

(23)

[\hat{\alpha}_1,\hat{\alpha}_2,\cdots,\hat{\alpha}_l]=\Big[\frac{1}{\lambda}\sum_i\varphi_1(P_i),\ \frac{1}{\lambda}\sum_i\varphi_2(P_i),\ \cdots,\ \frac{1}{\lambda}\sum_i\varphi_l(P_i)\Big]

(24)

\hat{\alpha}_1+\hat{\alpha}_2+\cdots+\hat{\alpha}_l=\frac{1}{\lambda}\sum_i\big(\varphi_1(P_i)+\varphi_2(P_i)+\cdots+\varphi_l(P_i)\big)=1

Furthermore, since ∑j φj (Pi) = 1 for each of the m points, we know λ = m, so:

(25)

[\hat{\alpha}_1,\hat{\alpha}_2,\cdots,\hat{\alpha}_l]=\Big[\frac{1}{m}\sum_i\varphi_1(P_i),\ \frac{1}{m}\sum_i\varphi_2(P_i),\ \cdots,\ \frac{1}{m}\sum_i\varphi_l(P_i)\Big]

where φ is also a function of parameters, and we can resolve this using the following iteration:

Step 1: Let

θ = [α1, α2, ⋯ , αl, μ1, μ2, ⋯ , μl, σ1², σ2², ⋯ , σl²]

Given an initial value and in order to achieve convergence, μ1, μ2, ⋯ , μl may be calculated by a clustering method.

Step 2: Calculate φj (Pi).

Step 3: Calculate μ̃j = ∑i φj (Pi) Pi / ∑i φj (Pi).

Step 4: Calculate σ̃j² = ∑i φj (Pi) (Pi - μ̃j)^T (Pi - μ̃j) / ∑i φj (Pi).

Step 5: Calculate α̂j = (1/m) ∑i φj (Pi).

Step 6: Let

θ̂ = [α̂1, α̂2, ⋯ , α̂l, μ̂1, μ̂2, ⋯ , μ̂l, σ̂1², σ̂2², ⋯ , σ̂l²]. If ∥θ - θ̂∥ < δ for a given threshold δ, then stop the process; otherwise, return to Step 2.
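Steps 1–6 above are the standard EM iteration for a GMM. The following is a minimal one-dimensional sketch (function and variable names are illustrative, not from the paper); the quantile-based initialization stands in for the clustering method mentioned in Step 1:

```python
import numpy as np

def em_gmm_1d(points, l, n_iter=200, tol=1e-8):
    """EM iteration for a 1-D GMM G(P) = sum_j alpha_j * delta(P, mu_j, sigma_j^2),
    following Steps 1-6."""
    P = np.asarray(points, dtype=float)
    m = len(P)
    # Step 1: initial parameter vector theta = [alpha, mu, sigma^2]
    mu = np.quantile(P, np.linspace(0.1, 0.9, l))
    sigma2 = np.full(l, P.var())
    alpha = np.full(l, 1.0 / l)
    for _ in range(n_iter):
        # Step 2: responsibilities phi_j(P_i)
        dens = np.exp(-0.5 * (P[:, None] - mu) ** 2 / sigma2) / np.sqrt(2 * np.pi * sigma2)
        phi = alpha * dens
        phi /= phi.sum(axis=1, keepdims=True)
        Nj = phi.sum(axis=0)
        # Steps 3-5: update means, variances, and mixing weights
        new_mu = (phi * P[:, None]).sum(axis=0) / Nj
        new_sigma2 = (phi * (P[:, None] - new_mu) ** 2).sum(axis=0) / Nj
        new_alpha = Nj / m
        # Step 6: stop when the parameter vector stops moving
        moved = max(np.abs(new_mu - mu).max(),
                    np.abs(new_sigma2 - sigma2).max(),
                    np.abs(new_alpha - alpha).max())
        mu, sigma2, alpha = new_mu, new_sigma2, new_alpha
        if moved < tol:
            break
    return alpha, mu, sigma2
```

On two well-separated clusters of equal size, the recovered mixing weights converge to roughly 0.5 each and the means to the cluster centers.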

In actuality, the density function of information fusion under this special structure is a product of SGM densities. For all information structures vk and their SGM densities δ(vk), δ(I_fusion) = ∏k δ(vk); therefore, we have that:

(27)

This is a linear transformation of the basic Gaussian function. Thus, for any two information structures vi and vj, the fusion result is vij = < αPij, βdij, γδij >, where α, β, and γ are undetermined coefficients.
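The product-of-SGMs view can be checked concretely in one dimension: the product of two Gaussian densities is, up to a constant factor, another Gaussian whose precision is the sum of the originals' precisions and whose mean is their precision-weighted average. This sketch uses the standard Gaussian product rule (not taken from the paper) to illustrate why the fused density stays Gaussian-shaped:

```python
import numpy as np

def gaussian_product(mu1, s1sq, mu2, s2sq):
    """Parameters of the Gaussian proportional to N(x; mu1, s1sq) * N(x; mu2, s2sq):
    precisions add, and the fused mean is the precision-weighted average."""
    s_sq = 1.0 / (1.0 / s1sq + 1.0 / s2sq)
    mu = s_sq * (mu1 / s1sq + mu2 / s2sq)
    return mu, s_sq

def npdf(x, mu, s_sq):
    """One-dimensional Gaussian density of Eq. (6)."""
    return np.exp(-0.5 * (x - mu) ** 2 / s_sq) / np.sqrt(2 * np.pi * s_sq)

# The pointwise product npdf(x, mu1, s1sq) * npdf(x, mu2, s2sq) equals
# c * npdf(x, mu, s_sq) for a constant c that does not depend on x.
```

For example, fusing N(0, 1) with N(4, 1) yields mean 2 and variance 0.5, and the ratio of the pointwise product to the fused density is constant in x.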

3 Fuzzy implications of information structure under IF-THEN rules

3.1 Fuzzy implications of information structures under IF-THEN rules

In fuzzy sets, the rule “IF x is A¯, THEN y is B¯” indicates a fuzzy implication between A¯ and B¯, denoted A¯→B¯. If we let x, y ∈ [0, 1] be the memberships of A¯ and B¯, respectively, the Mamdani model computes the membership as: ∀x, y ∈ [0, 1], F(x, y) = min{x, y}.

We construct a fuzzy membership based on a new fuzzy implication and inference system. We also derive a similarity relationship and apply this to the Gaussian density function-based fuzzy rule inference system. For δk in a rule-based IF-THEN inference system, suppose that the rule set is:

(29)

(30)

Thus, for any other implication operators, the function of rules will have the form:

(31)

M(x,y)=\frac{1}{2\pi}\,e^{-\frac{1}{2}(Ax^2+By^2+Cx+Dy+E)}
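The quadratic-exponent form of M(x, y) above can be recovered by expanding the product of two Gaussian membership functions. In this sketch (the means and widths are illustrative parameters, not values from the paper), the coefficients A–E are read off the expansion, with the log-normalizers absorbed into E:

```python
import math

def product_form_coeffs(mu_x, s_x, mu_y, s_y):
    """Coefficients A..E such that the product of the normalized Gaussian
    densities in x and y equals (1/2pi) * exp(-(A x^2 + B y^2 + C x + D y + E)/2)."""
    A = 1.0 / s_x ** 2
    B = 1.0 / s_y ** 2
    C = -2.0 * mu_x / s_x ** 2
    D = -2.0 * mu_y / s_y ** 2
    # constant terms of the squares, plus 2*ln(s_x*s_y) from the normalizers
    E = mu_x ** 2 / s_x ** 2 + mu_y ** 2 / s_y ** 2 + 2.0 * math.log(s_x * s_y)
    return A, B, C, D, E

def M(x, y, coeffs):
    """Evaluate the quadratic-exponent form with the expanded coefficients."""
    A, B, C, D, E = coeffs
    return math.exp(-0.5 * (A * x * x + B * y * y + C * x + D * y + E)) / (2 * math.pi)
```

Evaluating M with these coefficients reproduces the direct product of the two normalized Gaussian factors at every point.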

4 Applications

4.1 Mamdani model-based fuzzy control inference system using nonlinear conjugate gradient

In the previous section, information was formalized as vk = < Pk, dk, ρk >, where Pk is a central point in Rn, dk ∈ R, and ρk is a Gaussian density function. Pk and dk operate as fuzzy numbers under the fuzzy logical operations of Section 2. In our fuzzy rule-based inference system, if different information (multi-source information) implies the same conclusion, then this information is integrated. Supposing that the multi-source information structure vk concludes with a particular assertion at the ρk level, we have that:

(33)

IF Θ THEN ϖ(Θ).

From the previous section, we know that ϖ(Θ) is a Gaussian density function, so the rule is re-labeled as IF X THEN f(X). For this rule set, we have

Ri: IF X THEN fi(X)

However, as f (X) is a nonlinear function, it is difficult to find its minimum point under the Mamdani model, so we need to linearize f (X) and use the nonlinear conjugate gradient algorithm to optimize the parameters of f (X).

If we suppose that f(x) = (Ax - b)^T (Ax - b), then the gradient is ∇x f(x) = 2A^T (Ax - b), and the objective is to find x subject to ∇x f(x) = 0. The nonlinear conjugate gradient requires f to be twice differentiable; as f is a Gaussian function, it is infinitely differentiable. Starting in the steepest-descent direction Δx0 = -∇x f(x0) with step size α, we have that:

(34)

\alpha_0=\arg\min_{\alpha}f(x_0+\alpha\,\Delta x_0)

(35)

x_1=x_0+\alpha_0\,\Delta x_0

This is the first iteration in the direction of Δx0, and by setting the initial conjugate direction s0 = Δx0, the following steps will calculate Δxn:

Step 1: Calculate Δxn = -∇x f(xn).

Step 2: Calculate \beta_n=\frac{\Delta x_n^T(\Delta x_n-\Delta x_{n-1})}{\Delta x_{n-1}^T\,\Delta x_{n-1}} (Polak–Ribière).

Step 3: Update the conjugate direction sn = Δxn + βn sn-1.

Step 4: Calculate \alpha_n=\arg\min_{\alpha}f(x_n+\alpha s_n).

Step 5: Update xn+1 = xn + αn sn.
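Steps 1–5 above can be sketched as follows. The coarse grid search over α is a stand-in for the exact line search of Step 4, and the non-negativity clamp on β is a common practical restart rule rather than part of the text above:

```python
import numpy as np

def conjugate_gradient_pr(f, grad, x0, n_iter=100, tol=1e-8):
    """Nonlinear conjugate gradient with the Polak-Ribiere beta of Step 2."""
    x = np.asarray(x0, dtype=float)
    dx = -grad(x)          # Delta x_0: steepest-descent direction
    s = dx.copy()          # initial conjugate direction s_0
    for _ in range(n_iter):
        if np.linalg.norm(dx) < tol:
            break
        # Step 4 (approximate): pick the best alpha from a geometric grid
        alphas = 2.0 ** np.arange(4, -40, -1)
        alpha = min(alphas, key=lambda a: f(x + a * s))
        x = x + alpha * s                           # Step 5
        dx_new = -grad(x)                           # Step 1
        beta = dx_new @ (dx_new - dx) / (dx @ dx)   # Step 2 (Polak-Ribiere)
        s = dx_new + max(beta, 0.0) * s             # Step 3, restart when beta < 0
        dx = dx_new
    return x
```

On the quadratic f(x) = (Ax - b)^T(Ax - b) from the text, the iterate approaches the solution of Ax = b.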

The algorithm is based on a quadratic approximation, which we use to normalize the Gaussian function f(x) in order to speed up the iterations. Considering a simplified Mamdani model, and from formula (31), we know that:

(36)

M(x,y)=\frac{1}{2\pi}\,e^{-\frac{1}{2}(Ax^2+By^2+Cx+Dy+E)}

Using the nonlinear conjugate gradient, we obtain the results given in Fig. 1 and Table 1 by comparison with other special functions. From Table 1, we see that the Gaussian density function is approximated in just a few steps by the nonlinear conjugate gradient algorithm, which is why we selected the Gaussian distribution as the density function of this special structure. We also compared other forms of density function, which require more steps under the nonlinear conjugate gradient algorithm.

4.2 Takagi–Sugeno model in RGS-FIS

Takagi and Sugeno [18] proposed a fuzzy IF-THEN rule system as the local input–output relations of a nonlinear system to scale the population of rules under a multi-dimensional fuzzy inference system, known as the T-S model [21]. The normal rules for the T-S model under the special information structure proposed for our information fusion method are:

RT-S: IF INPUT-1 is I1, INPUT-2 is I2, ⋯ , INPUT-n is In, THEN If = f(I1, I2, ⋯ , In).

The T-S model outputs a linear, non-constant function that will reduce the population of rules.

From rule set RT-S, we can simplify If = ∑i (ai δi + bi di), in which ai and bi are undetermined constants. Let the standard deviation in Equation (6) be σ = 1 and the mean μ = 0; thus, I_f=\sum_{i=1}^{n}\Big(\frac{a_i}{\sqrt{2\pi}}e^{-\frac{x^2}{2}}+b_ix\Big). The first part of If is a GMM that can be estimated by the method of Section 2.2, and the second part of If is a linear function (see Fig. 2).

Furthermore, for the nonlinear conjugate gradient proposed in Section 4.1, finding the minimum point took 100 steps and 301 gradient evaluations (the Mamdani model). As a result, we can simplify this in RGS-FIS under the T-S model to output three linear membership functions. Suppose that the inputs are Gaussian-shaped rules and the outputs are linear functions: let the membership functions of INPUT 1 and INPUT 2 be Gaussian, and let the OUTPUT be composed of three linear functions [33]. We then have the RGS-FIS system under the T-S model (see Fig. 3).
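The T-S evaluation with Gaussian-shaped inputs and linear outputs can be sketched as a weighted average of rule consequents. The rule parameters below are hypothetical placeholders for illustration, not the fitted values behind Fig. 3:

```python
import math

def gauss_mf(x, mu, sigma):
    """Gaussian-shaped membership for an input of RGS-FIS."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2)

def ts_inference(x1, x2):
    """T-S style step: rule firing strengths come from the Gaussian
    memberships of INPUT 1 and INPUT 2 (combined by product), and the crisp
    output is the weighted average of the linear consequents
    f_i = a_i*x1 + b_i*x2 + c_i (hypothetical coefficients)."""
    rules = [
        # (mu1, mu2, (a, b, c)) -- illustrative rule parameters
        (0.0, 0.0, (1.0, 0.0, 0.0)),
        (1.0, 0.0, (0.0, 1.0, 0.5)),
        (1.0, 1.0, (0.5, 0.5, 1.0)),
    ]
    sigma = 0.5
    w = [gauss_mf(x1, m1, sigma) * gauss_mf(x2, m2, sigma) for m1, m2, _ in rules]
    f = [a * x1 + b * x2 + c for _, _, (a, b, c) in rules]
    return sum(wi * fi for wi, fi in zip(w, f)) / sum(w)
```

Because the output is a convex combination of the linear consequents, it is always bounded by the smallest and largest consequent values at the given input.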

5 Concluding remarks and future work

This paper proposed a novel information structure applicable to a Gaussian-shaped FIS. We developed the RGS-FIS approach using the nonlinear conjugate gradient algorithm and a T-S model. However, there are two problems with RGS-FIS: one is that the new fusion operator's parameters depend on a complex estimation process, and the other is that all data variables are assumed to be independent (r = 0). The model selection for similarity computing under rule-based fuzzy implication operations should also be improved.

Future work will focus on the pre-processing of datasets as well as the estimation of model parameters. Pre-processing will tune the parameters of the model to display a simpler mathematical presentation and assure a robust inference process. Furthermore, the fusion operator needs to be improved so that it does not solely depend on fuzzy implications. Although similarity computing is the key factor for calculating the possibility of IF-THEN rules, it is not clear whether a feasible algorithm can be developed for this. Hence, the possibility of the IF-THEN rules also needs to be calculated and improved.

Acknowledgments

The authors would like to thank the editors, the anonymous reviewers, and Dr. Thayer El-Dajjani for their most constructive comments and suggestions to improve the quality of this paper. This work is supported by the Zhejiang Provincial Natural Science Fund under Grant No. LY13H180012.