This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

To overcome the performance degradation caused by steering vector mismatches, strict restrictions on the number of available snapshots, and numerous interferences, a novel beamforming approach based on the nonlinear least-squares support vector regression machine (LS-SVR) is derived in this paper. In this approach, the conventional linearly constrained minimum variance cost function used by the minimum variance distortionless response (MVDR) beamformer is replaced by a squared-loss function to increase robustness in complex scenarios and provide additional control over the sidelobe level. Gaussian kernels are also used to obtain better generalization capacity. The approach has two highlights: a recursive regression procedure that estimates the weight vectors in real time, and a sparse model with a novelty criterion that reduces the final size of the beamformer. Analysis and simulation tests show that the proposed approach offers better noise suppression capability and achieves near-optimal signal-to-interference-and-noise ratio (SINR) with a low computational burden, as compared to other recently proposed robust beamforming techniques.

As an important branch of modern array signal processing, beamforming has been widely studied and applied in radar, wireless communication, sonar, medical imaging, and astronomy. Standard beamforming approaches, such as the minimum variance distortionless response (MVDR) beamformer [1], are usually derived for an ideal antenna array with an exactly known array manifold. They are therefore very sensitive to practical circumstances, and their performance is seriously degraded by factors such as steering vector mismatch, array calibration errors and restrictions on the number of snapshots.

During the last decades, robust beamforming approaches that resist model mismatches and environment changes have been widely studied [2–5]. Among others, the diagonal loading (DL) algorithm introduces a penalty term into the objective function, which effectively reduces the eigenvalue spread of the noise and prevents distortion of the beampattern [6]. Nevertheless, how to obtain the optimal loading factor for DL remains a serious issue when the desired steering vector and/or the number of available snapshots are uncertain [7]. Robust adaptive beamforming based on worst-case performance optimization delimits the uncertainty set of steering vectors by upper bounding the norm of the steering vector mismatch [8]. However, neither the mismatch vector nor its upper bound is known in practice. To overcome this model defect in the standard DL algorithm, an adaptive beamforming method was developed that iteratively estimates the difference between the actual and presumed steering vectors in order to maximize the output signal-to-interference-plus-noise ratio (SINR) [9–11]. However, this adaptive beamforming algorithm is not sufficiently reliable when the number of snapshots is small.

To cope with jamming signals, poor array calibration and signal wave-front distortions, the MVDR beamformer has been modified by incorporating multiple linear constraints [12–14]. However, each additional constraint reduces the degrees of freedom of the array in the linear beamforming framework. Nonlinear beamforming approaches offer a way to address this issue because they can adapt better to the statistical properties of the given data than linear ones [15]. Neural networks have been applied to beamforming, among other nonlinear array processing tasks, but this approach suffers from serious drawbacks such as over-fitting and local minima, which lead to suboptimal solutions [16].

The support vector machine (SVM), introduced by Vapnik [17], is an important methodology for pattern classification and nonlinear function approximation. It addresses the beamforming problem by incorporating additional inequality constraints that penalize sidelobe levels and allow a certain error in the desired signal direction [18]. The MVDR beamforming method is thereby reformulated, and the cost function turns out to be equivalent to that of SVM regression. However, the time needed to train an SVM beamformer scales superlinearly with the number of observations, which leads to a prohibitive computational burden in online operation [19]. The least-squares support vector machine (LS-SVM) inherits the generalization capacity of the SVM. Because it solves a set of linear equations instead of the quadratic programming (QP) problem of the standard SVM, training is simpler and computationally cheaper [20]. The main drawback of LS-SVM is that it works in batch mode, which makes it difficult to use in large-scale applications. Recent research on LS-SVM focuses on improving the training algorithms, model selection and sparseness [21,22].

This paper presents a new LS-SVR-based approach to robust beamforming. The approach alleviates the degradation of the array output SINR in the presence of steering vector mismatches, strict restrictions on the number of available snapshots, and numerous interferences by replacing the conventional linearly constrained minimum variance cost function with a squared-loss function, and it achieves better generalization capacity by applying Gaussian kernels to the array observations. We also present a fast recursive procedure that estimates the weight vectors in real time, and a novelty criterion that performs model reduction. The paper is organized as follows. The signal model, together with the minimum mean square error (MMSE) and MVDR beamformer solutions, is presented in Section 2. The basic principle of the LS-SVR-based beamforming method is introduced in Section 3. Section 4 provides a recursive procedure to calculate the regression parameters, and Section 5 presents a sparse model. Simulation tests under different mismatch scenarios are reported in Section 6. Conclusions are given in Section 7.

2. Sensor Signal Model

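Consider a linear array of $M$ sensors that receives signals from $D$ narrowband sources. The vector of array observations $x(t) \in \mathbb{C}^{M \times 1}$ at time $t$ can be modeled as:

$$x(t) = A s(t) + n(t) \tag{1}$$

where $\theta = [\theta_1, \theta_2, \ldots, \theta_D]^T \in \mathbb{R}^{D \times 1}$ is the vector of directions of arrival (DOA), $(\cdot)^T$ stands for transpose, and $A \in \mathbb{C}^{M \times D}$ is the matrix containing the array steering vectors $a(\theta_i) = [1, e^{-j 2\pi \sin(\theta_i) d/\lambda}, \ldots, e^{-j 2\pi (M-1) \sin(\theta_i) d/\lambda}]^T$, with $d$ the sensor spacing and $\lambda$ the wavelength. The uncorrelated sources are represented by the vector $s(t) = [s_1(t), s_2(t), \ldots, s_D(t)]^T \in \mathbb{C}^{D \times 1}$. The vector $n(t) \in \mathbb{C}^{M \times 1}$ is the sensor noise, which is assumed to be zero-mean complex Gaussian. The steering matrix is:

$$A = [a(\theta_1), a(\theta_2), \cdots, a(\theta_D)] \tag{2}$$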
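The signal model of Equation (1) can be illustrated with a minimal sketch. It uses only the Python standard library; the function names `steering_vector` and `snapshot`, the noise level and the source symbols are illustrative assumptions and are not taken from the paper.

```python
import cmath
import math
import random


def steering_vector(theta_deg, num_sensors, spacing_over_lambda=0.5):
    """Uniform linear array steering vector.

    The m-th entry is exp(-j * 2*pi * m * sin(theta) * d / lambda), m = 0..M-1.
    """
    phase = 2.0 * math.pi * spacing_over_lambda * math.sin(math.radians(theta_deg))
    return [cmath.exp(-1j * m * phase) for m in range(num_sensors)]


def snapshot(doas_deg, symbols, num_sensors, noise_std=0.1, rng=random):
    """One array snapshot x = A s + n with zero-mean complex Gaussian noise."""
    x = [0j] * num_sensors
    for theta, s in zip(doas_deg, symbols):
        a = steering_vector(theta, num_sensors)
        x = [xm + am * s for xm, am in zip(x, a)]
    scale = noise_std / math.sqrt(2.0)  # split the noise power over Re and Im
    return [xm + complex(rng.gauss(0.0, scale), rng.gauss(0.0, scale)) for xm in x]


# Illustrative values only: one desired source and two stronger interferers.
a = steering_vector(3.0, 10)
x = snapshot([3.0, -32.0, 17.0], [1 + 0j, 10j, -10 + 0j], 10, rng=random.Random(0))
```

Every entry of the steering vector has unit modulus, so the array geometry enters the model only through the inter-element phase progression.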

If the desired output is known for a set of training observations, then, according to the MMSE criterion, the complex vector of beamformer weights $w$ is given by:

$$w = R^{-1} p \tag{4}$$

where $R$ is the $M \times M$ covariance matrix and $p$ is the cross-correlation between the desired output and the received signal.

The classical MVDR beamformer minimizes the array output energy subject to the constraint of unit array response in the direction of the desired steering vector, that is:

$$\min_{w} \; w^H R w \quad \text{s.t.} \quad w^H a(\theta_1) = 1 \tag{5}$$

The constraint $w^H a(\theta_1) = 1$ prevents the gain in the look direction from being reduced, and the solution of Equation (5) is easily obtained with the Lagrange multiplier method:

$$w = \frac{R^{-1} a(\theta_1)}{a(\theta_1)^H R^{-1} a(\theta_1)} \tag{6}$$
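For completeness, the Lagrange multiplier step that leads from Equation (5) to Equation (6) can be sketched as follows. The multiplier $\lambda$ is introduced here for illustration only, and $R$ is assumed to be Hermitian and positive definite.

```latex
\begin{aligned}
L(w,\lambda) &= w^H R w + \lambda\,(1 - w^H a(\theta_1)) + \lambda^{*}\,(1 - a(\theta_1)^H w) \\
\frac{\partial L}{\partial w^{*}} = 0 \;&\Rightarrow\; R w = \lambda\, a(\theta_1)
  \;\Rightarrow\; w = \lambda R^{-1} a(\theta_1) \\
a(\theta_1)^H w = 1 \;&\Rightarrow\; \lambda = \frac{1}{a(\theta_1)^H R^{-1} a(\theta_1)} \\
&\Rightarrow\; w = \frac{R^{-1} a(\theta_1)}{a(\theta_1)^H R^{-1} a(\theta_1)}
\end{aligned}
```

Because $R^{-1}$ is Hermitian, the denominator is real and positive, so the constraints $w^H a(\theta_1) = 1$ and $a(\theta_1)^H w = 1$ are satisfied simultaneously.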

In practice, the exact covariance matrix $R$ is not available, and it is estimated by the sample covariance matrix

$$\hat{R} = \frac{1}{K} \sum_{k=1}^{K} x(k) x^H(k)$$

where $K$ is the number of observed snapshots.

The performance of the MVDR beamformer in Equation (5) is sensitive to mismatch between the presumed and actual steering vectors caused by uncertainty in the desired signal DOA, to strict restrictions on the number of available snapshots, and to numerous interferences.
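The MVDR solution with a sample covariance matrix can be sketched in a few lines. The following is a minimal standard-library illustration, not the authors' implementation: the helper names (`sample_covariance`, `solve`, `mvdr_weights`), the 4-element array, the single interferer at 17° and the noise level are all assumptions chosen for the example.

```python
import cmath
import math
import random


def steering_vector(theta_deg, num_sensors, spacing_over_lambda=0.5):
    phase = 2.0 * math.pi * spacing_over_lambda * math.sin(math.radians(theta_deg))
    return [cmath.exp(-1j * m * phase) for m in range(num_sensors)]


def sample_covariance(snapshots):
    """R_hat = (1/K) * sum_k x(k) x(k)^H."""
    m = len(snapshots[0])
    r = [[0j] * m for _ in range(m)]
    for x in snapshots:
        for i in range(m):
            for j in range(m):
                r[i][j] += x[i] * x[j].conjugate()
    k = len(snapshots)
    return [[v / k for v in row] for row in r]


def solve(matrix, rhs):
    """Solve matrix @ u = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[row][c] -= factor * aug[col][c]
    u = [0j] * n
    for row in reversed(range(n)):
        acc = aug[row][n] - sum(aug[row][c] * u[c] for c in range(row + 1, n))
        u[row] = acc / aug[row][row]
    return u


def mvdr_weights(r_hat, a):
    """Equation (6): w = R^{-1} a / (a^H R^{-1} a)."""
    u = solve(r_hat, a)  # u = R^{-1} a
    denom = sum(ai.conjugate() * ui for ai, ui in zip(a, u))
    return [ui / denom for ui in u]


# Illustrative scenario: interference-plus-noise snapshots only.
rng = random.Random(1)
M, K = 4, 50
a_int = steering_vector(17.0, M)
snaps = []
for _ in range(K):
    s = 10.0 * complex(rng.gauss(0, 1), rng.gauss(0, 1))
    snaps.append([s * am + complex(rng.gauss(0, 1), rng.gauss(0, 1)) for am in a_int])

a1 = steering_vector(3.0, M)
w = mvdr_weights(sample_covariance(snaps), a1)
gain_look = abs(sum(wi.conjugate() * ai for wi, ai in zip(w, a1)))
gain_int = abs(sum(wi.conjugate() * ai for wi, ai in zip(w, a_int)))
```

In this sketch the look-direction gain equals one by construction, while the gain toward the interferer is strongly attenuated. When the presumed steering vector `a1` differs from the true one and the desired signal is present in the snapshots, the same mechanism attenuates the desired signal, which is the sensitivity described above.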

3. LS-SVR-Based Beamforming Method

3.1. Nonlinear SVM-Based Beamforming

Suppose that a set of snapshots $x_i$, $i = 1, \ldots, N$, from an array at time $t$ and the corresponding set of desired symbols $y_i$, $i = 1, \ldots, N$, are available for training. The basic idea of nonlinear beamforming is to transform the data set $x_i$ into a higher (possibly infinite) dimensional feature space $\mathcal{H}$ by a nonlinear transformation $\varphi(\cdot)$. The beamformer output can then be formulated as a linear regression in $\mathcal{H}$:

$$y_i = w^H \varphi(x_i) + e_i \tag{7}$$

where $w \in \mathcal{H}$ is the linear parameter set and $e_i$ is the output error.

The parameter set $w$ can be estimated by minimizing a cost function on the output error $e_i$. For SVM regression, $w$ is estimated with the $\varepsilon$-insensitive loss function according to the minimum risk criterion, i.e.,

$$\min J(w, \varepsilon) = \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} L_\varepsilon(\xi_i, \zeta_i) \tag{8}$$

subject to $\xi_i, \zeta_i \ge 0$, where $C \ge 0$ is the tradeoff term between the minimization of the weight norm and the output error. The $\varepsilon$-insensitive loss function is given by:

$$L_\varepsilon(e) = \begin{cases} 0, & |e| < \varepsilon \\ |e| - \varepsilon, & |e| \ge \varepsilon \end{cases} \tag{9}$$

where $\varepsilon$ is a positive parameter used as an error threshold.

Solving Equation (8) regularizes the weight vector $w$, which improves the generalization capacity of the beamformer.
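The difference between the $\varepsilon$-insensitive loss of Equation (9) and a squared loss can be made concrete with a short sketch. The function names and the sample error values below are illustrative assumptions.

```python
def eps_insensitive_loss(error, eps):
    """Equation (9): zero inside the tube |e| < eps, linear in |e| outside it."""
    return max(0.0, abs(error) - eps)


def squared_loss(error):
    """Quadratic loss: every nonzero error is penalized."""
    return abs(error) ** 2


# Illustrative errors, compared under both losses.
errors = [0.05, -0.5, 2.0]
table = [(e, eps_insensitive_loss(e, 0.1), squared_loss(e)) for e in errors]
```

Errors inside the tube cost nothing under the $\varepsilon$-insensitive loss, whereas the squared loss penalizes every nonzero error and grows faster for large errors.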

3.2. Nonlinear LS-SVR Beamforming

LS-SVR replaces the inequality constraints of the standard SVM algorithm with equality constraints, and replaces the linear $\varepsilon$-insensitive loss with a quadratic loss. The LS-SVR beamformer can therefore be described as the following quadratic optimization problem [20]:

$$\min J(\bar{w}_t, \bar{e}_t) = \frac{1}{2}\|\bar{w}_t\|^2 + \frac{C}{2} \sum_{i=1}^{2N} e_{i,t}^2 \quad \text{s.t.} \quad e_{i,t} = \bar{y}_{i,t} - \bar{w}_t^T \varphi(\bar{x}_{i,t}) - b_t, \quad i = 1, 2, \cdots, 2N \tag{10}$$

where $e_{i,t}$ is the error at time $t$ and the barred quantities are the real-valued forms defined in Equation (11). The sum of squared errors in Equation (10) takes the place of the $\varepsilon$-insensitive loss function, and the constraints are linear equalities. This greatly reduces the computational complexity, since only a system of linear equations has to be solved instead of the QP problem of the SVM.

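The array observations of the beamformer are complex, whereas the variables in the objective function of the SVM are real, so the complex variables must be rewritten as real variables. The array observations $x_i$, the beamformer outputs $y_i$ and the weight vector $w_t$ are therefore rewritten as:

$$\begin{aligned} \bar{x}_{i,t} &= \begin{cases} [\mathrm{Re}(x_{i,t}^T) \;\; \mathrm{Im}(x_{i,t}^T)]^T \in \mathbb{R}^{2M}, & i = 1, \cdots, N \\ [\mathrm{Im}(x_{i-N,t}^T) \;\; -\mathrm{Re}(x_{i-N,t}^T)]^T \in \mathbb{R}^{2M}, & i = N+1, \cdots, 2N \end{cases} \\ \bar{w}_t &= [\mathrm{Re}(w_t^T) \;\; \mathrm{Im}(w_t^T)]^T \in \mathbb{R}^{2M} \\ \bar{y}_{i,t} &= \begin{cases} \mathrm{Re}(y_{i,t}), & i = 1, \cdots, N \\ \mathrm{Im}(y_{i-N,t}), & i = N+1, \cdots, 2N \end{cases} \end{aligned} \tag{11}$$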
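The real-valued rewriting can be checked numerically for the linear case $y = w^H x$: the inner product of the stacked weight vector with the first stacked snapshot gives $\mathrm{Re}(y)$, and with the second gives $\mathrm{Im}(y)$. The sketch below uses arbitrary test values, and the helper names `to_real_pair` and `stack_weights` are hypothetical.

```python
def to_real_pair(x):
    """Return the two real 2M-vectors of Equation (11) for one complex snapshot."""
    re = [v.real for v in x]
    im = [v.imag for v in x]
    return re + im, im + [-v for v in re]


def stack_weights(w):
    """Stack real and imaginary parts of the weights into one real 2M-vector."""
    return [v.real for v in w] + [v.imag for v in w]


# Arbitrary complex weights and snapshot (M = 2).
w = [0.5 - 1.0j, 2.0 + 0.25j]
x = [1.0 + 2.0j, -3.0 + 0.5j]

y = sum(wi.conjugate() * xi for wi, xi in zip(w, x))  # complex output w^H x

x_re, x_im = to_real_pair(x)
w_bar = stack_weights(w)
y_re = sum(a * b for a, b in zip(w_bar, x_re))  # should equal Re(y)
y_im = sum(a * b for a, b in zip(w_bar, x_im))  # should equal Im(y)
```

Each complex training pair thus contributes two real training pairs, which is why the sums in the real-valued formulation run up to $2N$.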

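The solution of the quadratic optimization problem of Equation (10) is the saddle point of the following Lagrange function:

$$L(\bar{w}_t, b_t, \bar{e}_t, \alpha_t) = J(\bar{w}_t, \bar{e}_t) - \sum_{i=1}^{2N} \alpha_i \{ \bar{w}_t^T \varphi(\bar{x}_{i,t}) + b_t + e_{i,t} - \bar{y}_{i,t} \} \tag{12}$$

where $\alpha_t = (\alpha_1, \alpha_2, \ldots, \alpha_{2N})^T$, with $\alpha_i \in \mathbb{R}$, is the vector of Lagrange multipliers, referred to as regression parameters in this paper.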

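According to the Karush-Kuhn-Tucker (KKT) conditions, differentiating the above function with respect to $\bar{w}_t$, $b_t$, $e_{i,t}$ and the Lagrange multipliers $\alpha_i$ yields:

$$\begin{cases} \dfrac{\partial L}{\partial \bar{w}_t} = 0 \Rightarrow \bar{w}_t = \sum_{i=1}^{2N} \alpha_i \varphi(\bar{x}_{i,t}) \\ \dfrac{\partial L}{\partial b_t} = 0 \Rightarrow \sum_{i=1}^{2N} \alpha_i = 0 \\ \dfrac{\partial L}{\partial e_{i,t}} = 0 \Rightarrow \alpha_i = C e_{i,t} \\ \dfrac{\partial L}{\partial \alpha_i} = 0 \Rightarrow \bar{w}_t^T \varphi(\bar{x}_{i,t}) + b_t + e_{i,t} - \bar{y}_{i,t} = 0 \end{cases} \tag{13}$$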

The system obtained from the KKT conditions is linear. Eliminating $\bar{w}_t$ and $e_{i,t}$ gives the following matrix equation:

$$\begin{bmatrix} 0 & \bar{e}^T \\ \bar{e} & Q_t + C^{-1} I \end{bmatrix} \begin{bmatrix} b_t \\ \alpha_t \end{bmatrix} = \begin{bmatrix} 0 \\ \bar{y}_t \end{bmatrix} \tag{14}$$

where $\bar{y}_t = (\bar{y}_1, \bar{y}_2, \ldots, \bar{y}_{2N})^T$, $\bar{e} = (1, 1, \ldots, 1)^T$, and $Q_t$ is the Gram matrix with elements $Q_{i,j} = \langle \varphi(\bar{x}_i), \varphi(\bar{x}_j) \rangle = k(\bar{x}_i, \bar{x}_j)$, $i, j = 1, 2, \ldots, 2N$. Here $k(\bar{x}_i, \bar{x}_j)$ denotes the kernel function associated with the nonlinear mapping $\varphi(\cdot)$, which greatly simplifies the calculation of inner products in the feature space. Linear methods can thus be applied to the transformed data without performing any computation in the high-dimensional feature space. The Gaussian kernel, the most widely used kernel function in practical applications, is adopted here:

$$k(\bar{x}_i, \bar{x}_j) = \exp\left( -\frac{\|\bar{x}_i - \bar{x}_j\|^2}{2\sigma^2} \right) \tag{15}$$

where $\sigma > 0$ is the kernel radius.

The output of the nonlinear LS-SVR beamformer is:

$$\bar{y}_{t+1} = \sum_{i=1}^{2N} \alpha_i k(\bar{x}_{t+1}, \bar{x}_{i,t}) + b_t \tag{16}$$

4. Recursive Algorithms

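Equation (16) shows that the beamformer output can be obtained once the regression parameters $\alpha_t$ and $b_t$ have been computed. Denoting $U_t = H_t^{-1} = (Q_t + C^{-1} I)^{-1}$, the LS-SVR system of Equation (14) can be written as:

$$\begin{bmatrix} 0 & \bar{e}^T \\ \bar{e} & U_t^{-1} \end{bmatrix} \begin{bmatrix} b_t \\ \alpha_t \end{bmatrix} = \begin{bmatrix} 0 \\ \bar{y}_t \end{bmatrix} \tag{17}$$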

Then, we have:
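$$b_t = \frac{\bar{e}^T U_t \bar{y}_t}{\bar{e}^T U_t \bar{e}}, \qquad \alpha_t = U_t \left( \bar{y}_t - \bar{e} \, \frac{\bar{e}^T U_t \bar{y}_t}{\bar{e}^T U_t \bar{e}} \right) \tag{18}$$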
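A batch solution following Equations (14) to (18) can be sketched with the standard library alone. This is a toy illustration under stated assumptions rather than the authors' implementation: the training data are one-dimensional real samples instead of stacked array snapshots, and the helper names (`gaussian_kernel`, `solve`, `train_ls_svr`, `predict`) are hypothetical. Instead of forming $U_t$ explicitly, the sketch solves $H_t \eta = \bar{e}$ and $H_t \nu = \bar{y}_t$, which yields the same $b_t$ and $\alpha_t$.

```python
import math


def gaussian_kernel(u, v, sigma):
    """Equation (15): exp(-||u - v||^2 / (2 sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq / (2.0 * sigma ** 2))


def solve(matrix, rhs):
    """Solve matrix @ u = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[row][c] -= factor * aug[col][c]
    u = [0.0] * n
    for row in reversed(range(n)):
        acc = aug[row][n] - sum(aug[row][c] * u[c] for c in range(row + 1, n))
        u[row] = acc / aug[row][row]
    return u


def train_ls_svr(xs, ys, c, sigma):
    """Return (alpha, b) from Equation (18) with H = Q + I / C."""
    n = len(xs)
    h = [[gaussian_kernel(xs[i], xs[j], sigma) + (1.0 / c if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    eta = solve(h, [1.0] * n)  # U e
    nu = solve(h, ys)          # U y
    b = sum(nu) / sum(eta)     # e^T U y / e^T U e
    alpha = [v - b * e for v, e in zip(nu, eta)]  # U (y - e b)
    return alpha, b


def predict(x_new, xs, alpha, b, sigma):
    """Equation (16): kernel expansion plus bias."""
    return sum(a * gaussian_kernel(x_new, xi, sigma) for a, xi in zip(alpha, xs)) + b


# Toy regression data (assumed for illustration).
xs = [[0.0], [0.5], [1.0], [1.5], [2.0]]
ys = [math.sin(x[0]) for x in xs]
alpha, b = train_ls_svr(xs, ys, c=100.0, sigma=1.0)
y_hat = predict([0.75], xs, alpha, b, sigma=1.0)
```

Two properties of the KKT system are easy to verify on this example: the regression parameters sum to zero, and the training residual at each sample equals $\alpha_i / C$.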

As the number of snapshots increases, the dimension of the Gram matrix $Q_t$ grows in proportion. The computation of the regression parameters $\alpha_t$ and $b_t$ therefore becomes very intensive, and a fast algorithm for computing $U_t$ is a key issue for the LS-SVR beamformer.

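At time step $t$, $Q_t$ and $H_t$ are matrices of dimension $2N \times 2N$:

$$Q_t = \begin{pmatrix} k(\bar{x}_1, \bar{x}_1) & \cdots & k(\bar{x}_{2N}, \bar{x}_1) \\ \vdots & \ddots & \vdots \\ k(\bar{x}_1, \bar{x}_{2N}) & \cdots & k(\bar{x}_{2N}, \bar{x}_{2N}) \end{pmatrix}, \qquad H_t = Q_t + C^{-1} I = \begin{pmatrix} k(\bar{x}_1, \bar{x}_1) + 1/C & \cdots & k(\bar{x}_{2N}, \bar{x}_1) \\ \vdots & \ddots & \vdots \\ k(\bar{x}_1, \bar{x}_{2N}) & \cdots & k(\bar{x}_{2N}, \bar{x}_{2N}) + 1/C \end{pmatrix} \tag{19}$$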

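As time runs to $t+1$, a new input snapshot $x_{t+1}$ and the corresponding desired array output $y_{t+1}$ are added to the current training set, so $Q_{t+1}$ and $H_{t+1}$ can be represented as:

$$\begin{aligned} Q_{t+1} &= \begin{pmatrix} k(\bar{x}_1,\bar{x}_1) & \cdots & k(\bar{x}_{2N},\bar{x}_1) & k(\bar{x}_{2(N+1)},\bar{x}_1) \\ \vdots & \ddots & \vdots & \vdots \\ k(\bar{x}_1,\bar{x}_{2N}) & \cdots & k(\bar{x}_{2N},\bar{x}_{2N}) & k(\bar{x}_{2(N+1)},\bar{x}_{2N}) \\ k(\bar{x}_1,\bar{x}_{2(N+1)}) & \cdots & k(\bar{x}_{2N},\bar{x}_{2(N+1)}) & k(\bar{x}_{2(N+1)},\bar{x}_{2(N+1)}) \end{pmatrix} \\ H_{t+1} &= Q_{t+1} + C^{-1} I = \begin{pmatrix} k(\bar{x}_1,\bar{x}_1) + 1/C & \cdots & k(\bar{x}_{2N},\bar{x}_1) & k(\bar{x}_{2(N+1)},\bar{x}_1) \\ \vdots & \ddots & \vdots & \vdots \\ k(\bar{x}_1,\bar{x}_{2N}) & \cdots & k(\bar{x}_{2N},\bar{x}_{2N}) + 1/C & k(\bar{x}_{2(N+1)},\bar{x}_{2N}) \\ k(\bar{x}_1,\bar{x}_{2(N+1)}) & \cdots & k(\bar{x}_{2N},\bar{x}_{2(N+1)}) & k(\bar{x}_{2(N+1)},\bar{x}_{2(N+1)}) + 1/C \end{pmatrix} \end{aligned} \tag{20}$$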

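According to the block matrix inversion theorem, the inverse of $H_{t+1}$ can be expressed through the inverse of $H_t$ and the new column as:

$$H_{t+1}^{-1} = \begin{pmatrix} H_t^{-1} + \beta H_t^{-1} \mathbf{v}_{t+1} \mathbf{v}_{t+1}^T H_t^{-1} & -\beta H_t^{-1} \mathbf{v}_{t+1} \\ -\beta \mathbf{v}_{t+1}^T H_t^{-1} & \beta \end{pmatrix} \tag{22}$$

where $\mathbf{v}_{t+1}$ is the vector of off-diagonal entries in the last column of $H_{t+1}$ in Equation (20), $v_{t+1}$ is its last diagonal entry, and $\beta = (v_{t+1} - \mathbf{v}_{t+1}^T H_t^{-1} \mathbf{v}_{t+1})^{-1}$. Thus the inverse of $H_{t+1}$, which is equal to $U_{t+1}$, can be calculated from the inverse of $H_t$, and it is not necessary to invert the high-dimensional matrix $H_{t+1}$ directly. The computational complexity is thereby greatly reduced, and the numerical stability problems that arise from direct matrix inversion are also avoided. When the set of snapshots is small, $U_t$ can be computed by direct matrix inversion.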
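The recursive update of the inverse can be sketched as follows. This is a minimal standard-library illustration under assumed toy data: the function name `grow_inverse`, the one-dimensional sample points and the value $C = 10$ are not from the paper. The sketch grows the inverse one row and column at a time and then checks the result against the directly assembled matrix.

```python
import math


def gaussian_kernel(u, v, sigma=1.0):
    sq = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq / (2.0 * sigma ** 2))


def grow_inverse(u, v, c):
    """Block-inverse update.

    Given U = H^{-1} (symmetric), the new off-diagonal column v and the new
    diagonal entry c, return the inverse of [[H, v], [v^T, c]].
    """
    n = len(u)
    uv = [sum(u[i][j] * v[j] for j in range(n)) for i in range(n)]  # H^{-1} v
    beta = 1.0 / (c - sum(vi * ui for vi, ui in zip(v, uv)))
    grown = [[u[i][j] + beta * uv[i] * uv[j] for j in range(n)] + [-beta * uv[i]]
             for i in range(n)]
    grown.append([-beta * x for x in uv] + [beta])
    return grown


# Toy data: grow the inverse one sample at a time.
points = [[0.0], [0.7], [1.5], [2.4]]
reg = 1.0 / 10.0  # 1 / C with C = 10 (assumed value)
inv = [[1.0 / (gaussian_kernel(points[0], points[0]) + reg)]]
for t in range(1, len(points)):
    col = [gaussian_kernel(points[t], points[i]) for i in range(t)]
    diag = gaussian_kernel(points[t], points[t]) + reg
    inv = grow_inverse(inv, col, diag)

# Check against the directly assembled matrix: H @ inv should be the identity.
n = len(points)
h = [[gaussian_kernel(points[i], points[j]) + (reg if i == j else 0.0)
      for j in range(n)] for i in range(n)]
product = [[sum(h[i][k] * inv[k][j] for k in range(n)) for j in range(n)]
           for i in range(n)]
max_err = max(abs(product[i][j] - (1.0 if i == j else 0.0))
              for i in range(n) for j in range(n))
```

Each update costs on the order of $n^2$ operations for an $n \times n$ inverse, compared with roughly $n^3$ for inverting the enlarged matrix from scratch.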

5. Sparsification

The crucial drawback of the LS-SVR beamformer is that it deals with a high-dimensional matrix whose dimension equals the number of snapshots, due to the use of a quadratic loss function. This poses a serious implementation problem for the proposed beamforming method, since the required memory and computational resources grow as time evolves. Several methods have been proposed to cope with this problem [23,24]. The sliding-window approach [25] fixes the size of the LS-SVR beamformer and allows it to operate online in time-varying environments by keeping only the last $N$ input snapshots in the window and discarding the rest. In [26], an exponential forgetting mechanism is introduced to describe the influence of past data on the present situation. This paper employs the novelty criterion presented by Platt [27,28] to reduce the final size of the proposed beamformer, keep the algorithm complexity bounded and realize online sparsification. The basic idea is to construct a dictionary with center set $\mathcal{C}$ and to update it according to the novelty criterion. The stages of the proposed sparsification are as follows:

Step 1: Initialize an empty center set $\mathcal{C}_0$;

Step 2: Calculate the distance between the new snapshot $x_t$ and the present dictionary, $dis = \min_{c_k \in \mathcal{C}_t} \|x_t - c_k\|$;

Step 3: If the distance obtained in Step 2 is smaller than the preset threshold $\delta_1$, $x_t$ is not added to the dictionary; otherwise the prediction error $e_t = y_t - \hat{y}_t$ is calculated;

Step 4: If $|e_t|$ is larger than another preset threshold $\delta_2$, $x_t$ is accepted as a new center and $\mathcal{C}_t$ is updated to $\mathcal{C}_{t+1}$; otherwise go to Step 2 with the next snapshot.
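The four steps above can be sketched as a single update function. This is a minimal illustration under stated assumptions: `novelty_update` is a hypothetical name, the predictor passed in is a fixed stub that always returns zero (in the beamformer it would be the current LS-SVR output), an empty dictionary is treated as automatically passing the distance test, and the samples are made up for the example.

```python
import math


def novelty_update(centers, x, y, predict, delta1, delta2):
    """Apply Steps 2-4 to one sample; return True if x was added as a center."""
    if centers:
        dist = min(math.dist(x, c) for c in centers)  # Step 2
        if dist < delta1:                             # Step 3: too close, skip
            return False
    error = y - predict(x)                            # Step 3: prediction error
    if abs(error) > delta2:                           # Step 4: poorly predicted
        centers.append(list(x))
        return True
    return False


# Step 1: empty center set. The stub predictor is an assumption for the demo.
centers = []
stub_predict = lambda x: 0.0
samples = [
    ([0.0, 0.0], 1.0),   # dictionary empty, large error -> added
    ([0.05, 0.0], 1.0),  # closer than delta1 to a center -> skipped
    ([1.0, 0.0], 0.01),  # far away but well predicted -> skipped
    ([2.0, 0.0], -0.5),  # far away and poorly predicted -> added
]
added = [novelty_update(centers, x, y, stub_predict, delta1=0.1, delta2=0.08)
         for x, y in samples]
```

A sample becomes a center only when it is both far from the existing dictionary and poorly predicted, which keeps redundant snapshots out of the model.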

Increasing $\delta_1$ and $\delta_2$ decreases the final size of the LS-SVR beamformer, but at the cost of performance degradation. In practical applications, $\delta_1$ is set to around one tenth of the kernel bandwidth, and $\delta_2$ to around the square root of the steady-state mean square error (MSE). Cross-validation can also be used to select appropriate thresholds.

With the above sparsification procedure, the computational complexity of the proposed beamformer is reduced from $O(N^2)$ to $O(K^2)$, where $K$ is the effective number of centers in the network at time $t$. As $K$ is finite, online real-time beamforming becomes practical.

6. Simulation Tests

To evaluate the performance of the proposed LS-SVR-based beamformer, simulation tests are carried out. A 10-element uniform linear array with half-wavelength spacing is considered. The desired signal comes from the presumed direction $\theta_1 = 3°$, and two uncorrelated interferences with an interference-to-noise ratio (INR) of 20 dB impinge on the array from $\theta_2 = −32°$ and $\theta_3 = 17°$, respectively. The additive noise is modeled as 0-dB complex white Gaussian noise. For comparison, the conventional MVDR, the diagonal loading MVDR (MVDR-DL), the ES [29], the SQP [9] and the RR [30] methods are considered. The parameters of the proposed beamformer, $\sigma$, $\delta_1$ and $\delta_2$, are set to 1.0, 0.1 and 0.08, respectively. The loading value of the MVDR-DL beamformer is set to ($P_e$ + 10 dB), where $P_e$ denotes the power of the desired signal. All results are obtained from 100 independent simulation runs.

The first simulation compares the performance of these beamformers when steering vector mismatch is present. From Figure 1(a), we observe that the proposed LS-SVR beamformer consistently improves its output SINR as the SNR increases and performs much closer to the ideal one as the input SNR varies from −20 dB to 30 dB. Due to the DOA mismatch, the MVDR beamformer treats the signal of interest as interference and places a null in the desired signal direction. As a result, the output SINR decreases. When the input SNR is larger than −5 dB, the output SINR of the MVDR beamformer degrades seriously. Compared with the MVDR beamformer, the MVDR-DL, ES, SQP and RR methods are more robust against DOA mismatch, but they still suffer from performance degradation as the input SNR becomes higher.

Figure 1(b) shows the normalized beampatterns when the input SNR is equal to 10 dB. As illustrated, the beampatterns of all the robust beamformers have nulls at the DOAs of the interferences. The proposed LS-SVR nevertheless outperforms the others, with a markedly lower sidelobe level and a distortionless response for the desired signal.

The covariance matrix is estimated inaccurately when snapshots are insufficient, when the DOA of the desired signal is mismatched, or when array calibration errors are present. This inaccuracy may degrade the array response. Hence, both insufficient snapshots and DOA mismatch are considered in the second simulation test. Figure 2 shows the resulting output SINRs versus the number of snapshots $K$. When the number of snapshots exceeds 20, the LS-SVR clearly outperforms the other beamformers tested. Owing to the steering vector mismatch, the MVDR beamformer treats the desired signal as interference and fails.

The third test demonstrates the performance of the proposed beamformer in a scenario with multiple interferences, again with steering vector mismatch present. As can be seen from Figure 3(a), the proposed algorithm performs as well as ES and SQP when the number of interferences is less than 5. When the number of interferences is increased to 8, the output SINR of the proposed LS-SVR beamformer is only 1 dB lower than that of the ideal beamformer. In contrast, the output SINRs of the other beamformers tested decrease dramatically because fewer degrees of freedom remain available to suppress the interference.

The corresponding beampatterns are shown in Figure 3(b), where four interferences with DOAs of $\theta_i$ = [17.4°, −11.5°, 53.1°, −23.5°] are taken into account. It can be seen that the LS-SVR beamformer not only places deep nulls at the DOAs of the interferences, but also achieves better sidelobe suppression than the other beamformers tested. Thus, the proposed LS-SVR method achieves better SINR performance than the usual robust linear beamforming algorithms in the case of numerous interferences.

To show the computational complexity of the novel approach, the growth of the dictionary size with the number of input samples is given in Figure 4. Only 396 centers are needed to calculate the beamformed output for 4,000 input samples, whereas the original LS-SVR algorithm needs 4,000 centers for the same case. The computational cost is thus largely reduced.

7. Conclusions

We present a novel nonlinear LS-SVR-based beamforming approach in this paper. The approach first replaces the conventional linearly constrained minimum variance cost function with a squared-loss function, which significantly increases robustness against mismatch problems and provides additional control over the sidelobe level. It also applies Gaussian kernels to the array observations to improve the generalization capacity. Finally, it uses a recursive regression procedure to estimate the weight vectors in real time and performs model reduction to reduce the final size of the beamformer.

Simulation tests with steering vector mismatch, numerous interferences and a limited number of available snapshots are carried out to verify the performance of the proposed beamforming algorithm in comparison with other recently proposed ones. The test results show that the proposed beamforming method significantly outperforms many other recently proposed linear robust beamforming techniques in terms of distortion of the desired signal and noise reduction in scenarios with DOA mismatch, limited observation samples, and numerous interferences.

This research was supported by the National Natural Science Foundation of China (Grant No.61071191) and Natural Science Foundation of Chongqing (CSTC 2011BB2048).