A Hybrid Conjugate Gradient Algorithm for Unconstrained Optimization as a Convex Combination of Hestenes-Stiefel and Dai-Yuan

Abstract: In this paper we propose and analyze another hybrid conjugate gradient algorithm in which the parameter $\beta_k$ is computed as a convex combination of $\beta_k^{HS}$ (Hestenes-Stiefel) and $\beta_k^{DY}$ (Dai-Yuan), i.e. $\beta_k^C = (1-\theta_k)\beta_k^{HS} + \theta_k \beta_k^{DY}$. The parameter $\theta_k$ in the convex combination is computed in such a way that the direction corresponding to the conjugate gradient algorithm is the Newton direction and the secant equation is satisfied. The algorithm uses the standard Wolfe line search conditions. Numerical comparisons with conjugate gradient algorithms on a set of 750 unconstrained optimization problems, some of them from the CUTE library, show that this hybrid computational scheme outperforms the Hestenes-Stiefel and Dai-Yuan conjugate gradient algorithms, as well as some other known hybrid conjugate gradient algorithms. Comparisons with CG_DESCENT by Hager and Zhang [17] and LBFGS by Liu and Nocedal [22] show that CG_DESCENT is more robust than our algorithm and that LBFGS is the top performer among these algorithms.

In this paper we consider the nonlinear unconstrained optimization problem

$\min\{f(x) : x \in \mathbb{R}^n\}$, (1)

where $f: \mathbb{R}^n \to \mathbb{R}$ is a continuously differentiable function, bounded from below. As is well known, for solving this problem, starting from an initial guess $x_0 \in \mathbb{R}^n$, a nonlinear conjugate gradient method generates a sequence $\{x_k\}$ as

$x_{k+1} = x_k + \alpha_k d_k$, (2)

where $\alpha_k > 0$ is obtained by line search, and the directions $d_k$ are generated as

$d_{k+1} = -g_{k+1} + \beta_k d_k$, $d_0 = -g_0$. (3)

In (3), $\beta_k$ is known as the conjugate gradient parameter, and $g_k = \nabla f(x_k)$. Consider $\|\cdot\|$ the Euclidean norm and define $y_k = g_{k+1} - g_k$ and $s_k = x_{k+1} - x_k$. The line search in conjugate gradient algorithms is often based on the standard Wolfe conditions:

$f(x_k + \alpha_k d_k) - f(x_k) \le \rho \alpha_k g_k^T d_k$, (4)

$g_{k+1}^T d_k \ge \sigma g_k^T d_k$, (5)

where $d_k$ is a descent direction and $0 < \rho < \sigma < 1$. Many conjugate gradient methods are known, and an excellent survey of these methods, with special attention to their global convergence, is given by Hager and Zhang [18]. Different conjugate gradient algorithms correspond to different choices for the scalar parameter $\beta_k$. The methods of Fletcher and Reeves (FR) [15], of Dai and Yuan (DY) [11] and the Conjugate Descent (CD) proposed by Fletcher [14]:

$\beta_k^{FR} = \frac{\|g_{k+1}\|^2}{\|g_k\|^2}$, $\beta_k^{DY} = \frac{\|g_{k+1}\|^2}{y_k^T d_k}$, $\beta_k^{CD} = -\frac{\|g_{k+1}\|^2}{g_k^T d_k}$,

have strong convergence properties, but they may have modest practical performance due to jamming. On the other hand, the methods of Polak-Ribière [23] and Polyak (PRP) [24], of Hestenes and Stiefel (HS) [19], and of Liu and Storey (LS) [21]:

$\beta_k^{PRP} = \frac{g_{k+1}^T y_k}{\|g_k\|^2}$, $\beta_k^{HS} = \frac{g_{k+1}^T y_k}{y_k^T d_k}$, $\beta_k^{LS} = -\frac{g_{k+1}^T y_k}{g_k^T d_k}$,

in general may not be convergent, but they often have better computational performance.
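To make these formulas concrete, the six classical parameters can be written as short NumPy functions. This is only an illustrative sketch (the function names and calling conventions are mine, not from any of the cited papers), with `g_new` standing for $g_{k+1}$, `g_old` for $g_k$, and `d` for $d_k$:

```python
import numpy as np

# Classical conjugate gradient parameters (illustrative sketch; names are mine).
# y = g_{k+1} - g_k is recomputed locally where needed.

def beta_fr(g_new, g_old):                  # Fletcher-Reeves
    return (g_new @ g_new) / (g_old @ g_old)

def beta_dy(g_new, g_old, d):               # Dai-Yuan
    y = g_new - g_old
    return (g_new @ g_new) / (y @ d)

def beta_cd(g_new, g_old, d):               # Conjugate Descent (Fletcher)
    return -(g_new @ g_new) / (g_old @ d)

def beta_prp(g_new, g_old):                 # Polak-Ribiere-Polyak
    y = g_new - g_old
    return (g_new @ y) / (g_old @ g_old)

def beta_hs(g_new, g_old, d):               # Hestenes-Stiefel
    y = g_new - g_old
    return (g_new @ y) / (y @ d)

def beta_ls(g_new, g_old, d):               # Liu-Storey
    y = g_new - g_old
    return -(g_new @ y) / (g_old @ d)
```

Note that with an exact line search ($g_{k+1}^T d_k = 0$) the denominators $y_k^T d_k$ and $-g_k^T d_k$ coincide, so DY reduces to CD and HS to LS; with an inexact Wolfe line search the six formulas generally differ.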

In this paper we focus on hybrid conjugate gradient methods. These methods are combinations of different conjugate gradient algorithms, proposed mainly to avoid the jamming phenomenon and to improve the performance of the above conjugate gradient algorithms. One of the first hybrid conjugate gradient algorithms was introduced by Touati-Ahmed and Storey [27], where the parameter $\beta_k$ is computed as:

$\beta_k^{TaS} = \begin{cases} \beta_k^{PRP}, & \text{if } 0 \le \beta_k^{PRP} \le \beta_k^{FR}, \\ \beta_k^{FR}, & \text{otherwise}. \end{cases}$

The PRP method has a built-in restart feature that directly addresses jamming. Indeed, when the step $s_k = x_{k+1} - x_k$ is small, the factor $y_k = g_{k+1} - g_k$ in the numerator of $\beta_k^{PRP}$ tends to zero. Therefore $\beta_k^{PRP}$ becomes small and the search direction $d_{k+1}$ is very close to the steepest descent direction $-g_{k+1}$. Hence, when the iterations jam, the method of Touati-Ahmed and Storey uses the PRP computational scheme.
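The Touati-Ahmed and Storey rule can be sketched as follows (an illustrative sketch only; the helper name `beta_tas` is mine):

```python
import numpy as np

def beta_tas(g_new, g_old):
    """Touati-Ahmed and Storey hybrid (sketch): use beta_PRP while it stays
    in [0, beta_FR]; otherwise fall back to beta_FR."""
    y = g_new - g_old
    b_prp = (g_new @ y) / (g_old @ g_old)      # Polak-Ribiere-Polyak
    b_fr = (g_new @ g_new) / (g_old @ g_old)   # Fletcher-Reeves
    return b_prp if 0.0 <= b_prp <= b_fr else b_fr
```

When the gradients barely change (jamming), `b_prp` is near zero and is selected, which is exactly the restart behavior described above.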

Another hybrid conjugate gradient method was given by Hu and Storey [20], where $\beta_k$ in (3) is:

$\beta_k^{HuS} = \max\{0, \min\{\beta_k^{PRP}, \beta_k^{FR}\}\}$.

As above, when the method of Hu and Storey jams, the PRP scheme is used instead.

The combination of the LS and CD conjugate gradient methods leads to the following hybrid method:

$\beta_k^{LS\text{-}CD} = \max\{0, \min\{\beta_k^{LS}, \beta_k^{CD}\}\}$.

The CD method of Fletcher [14] is very close to the FR method: with an exact line search, CD is identical to FR. Similarly, with an exact line search, LS is identical to PRP. Therefore, the hybrid LS-CD method with an exact line search has performance similar to that of the hybrid method of Hu and Storey.
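Both of these truncation-style hybrids share the same $\max\{0, \min\{\cdot, \cdot\}\}$ pattern; a minimal sketch, with function names of my own choosing:

```python
import numpy as np

def beta_hus(g_new, g_old):
    """Hu-Storey hybrid (sketch): clip beta_PRP into [0, beta_FR]."""
    y = g_new - g_old
    b_prp = (g_new @ y) / (g_old @ g_old)
    b_fr = (g_new @ g_new) / (g_old @ g_old)
    return max(0.0, min(b_prp, b_fr))

def beta_lscd(g_new, g_old, d):
    """LS-CD hybrid (sketch): clip beta_LS into [0, beta_CD]."""
    b_ls = -(g_new @ (g_new - g_old)) / (g_old @ d)
    b_cd = -(g_new @ g_new) / (g_old @ d)
    return max(0.0, min(b_ls, b_cd))
```

The `max(0.0, ...)` guard truncates negative values to zero, which amounts to restarting along the steepest descent direction.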

Gilbert and Nocedal [16] suggested a combination of the PRP and FR methods as:

$\beta_k^{GN} = \max\{-\beta_k^{FR}, \min\{\beta_k^{PRP}, \beta_k^{FR}\}\}$.

Since $\beta_k^{FR}$ is always nonnegative, it follows that $\beta_k^{GN}$ can be negative. The method of Gilbert and Nocedal has the same advantage of avoiding jamming.
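A sketch of the Gilbert-Nocedal rule (the name `beta_gn` is my own); note that the returned value can indeed be negative, but never below $-\beta_k^{FR}$:

```python
import numpy as np

def beta_gn(g_new, g_old):
    """Gilbert-Nocedal hybrid (sketch): restrict beta_PRP to the interval
    [-beta_FR, beta_FR], allowing negative values."""
    y = g_new - g_old
    b_prp = (g_new @ y) / (g_old @ g_old)
    b_fr = (g_new @ g_new) / (g_old @ g_old)
    return max(-b_fr, min(b_prp, b_fr))
```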

Using the standard Wolfe line search, the DY method always generates descent directions, and if the gradient is Lipschitz continuous the method is globally convergent. In an effort to improve their algorithm, Dai and Yuan [12] combined it in a projective manner with that of Hestenes and Stiefel, proposing the following two hybrid methods:

$\beta_k^{hDY} = \max\{-c\,\beta_k^{DY}, \min\{\beta_k^{HS}, \beta_k^{DY}\}\}$,

$\beta_k^{hDYz} = \max\{0, \min\{\beta_k^{HS}, \beta_k^{DY}\}\}$,

where $c = (1-\sigma)/(1+\sigma)$. Under the standard Wolfe conditions (4) and (5) and the Lipschitz continuity of the gradient, Dai and Yuan [12] established the global convergence of these hybrid computational schemes.
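The two Dai-Yuan hybrids can be sketched as follows; the default value $\sigma = 0.9$ for the Wolfe parameter and the function names are my illustrative assumptions:

```python
import numpy as np

def beta_hdy(g_new, g_old, d, sigma=0.9):
    """Dai-Yuan hybrid hDY (sketch): clip beta_HS into [-c*beta_DY, beta_DY],
    with c = (1 - sigma)/(1 + sigma) from the Wolfe parameter sigma."""
    y = g_new - g_old
    b_hs = (g_new @ y) / (y @ d)
    b_dy = (g_new @ g_new) / (y @ d)
    c = (1.0 - sigma) / (1.0 + sigma)
    return max(-c * b_dy, min(b_hs, b_dy))

def beta_hdyz(g_new, g_old, d):
    """Dai-Yuan hybrid hDYz (sketch): clip beta_HS into [0, beta_DY]."""
    y = g_new - g_old
    b_hs = (g_new @ y) / (y @ d)
    b_dy = (g_new @ g_new) / (y @ d)
    return max(0.0, min(b_hs, b_dy))
```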

In contrast to the hybrid methods $\beta_k^{hDY}$ and $\beta_k^{hDYz}$, in this paper we propose another hybrid conjugate gradient method in which the parameter $\beta_k$ is computed as a convex combination of $\beta_k^{HS}$ and $\beta_k^{DY}$, i.e. $\beta_k^C = (1-\theta_k)\beta_k^{HS} + \theta_k \beta_k^{DY}$ with $\theta_k \in [0,1]$. We selected these two methods to combine in a hybrid conjugate gradient algorithm because HS has good computational properties, on one side, and DY has strong convergence properties, on the other side. The HS method automatically adjusts to avoid jamming and often performs better in practice than DY; we exploit this in order to obtain a good practical conjugate gradient algorithm. The structure of the paper is as follows. In Section 2 we introduce our hybrid conjugate gradient algorithm and prove that it generates descent directions which, under some conditions, satisfy the sufficient descent condition. Section 3 presents the algorithm, and in Section 4 we give its convergence analysis. In Section 5 we present numerical experiments and Dolan-Moré [13] performance profiles of this new hybrid conjugate gradient algorithm versus some other conjugate gradient algorithms. The performance profiles on a set of 750 unconstrained optimization problems, some from the CUTE test problem library [7] as well as other unconstrained optimization problems presented in [1], show that this hybrid conjugate gradient algorithm outperforms the known hybrid conjugate gradient algorithms. However, comparisons between our algorithm and CG_DESCENT by Hager and Zhang [17] show that CG_DESCENT is more robust. On the other hand, comparisons with LBFGS by Liu and Nocedal [22] show that the limited memory LBFGS is the top performer.
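The convex combination itself can be sketched as below. The computation of $\theta_k$ from the Newton direction and the secant equation is developed in Section 2 and is not reproduced here; the clipping of $\theta_k$ to $[0,1]$ and the function name are my own illustrative choices:

```python
import numpy as np

def beta_hybrid(g_new, g_old, d, theta):
    """Convex combination beta_C = (1 - theta)*beta_HS + theta*beta_DY (sketch).
    theta is clipped to [0, 1] so the combination stays convex; the paper
    derives theta from the Newton direction and the secant equation."""
    theta = min(max(theta, 0.0), 1.0)
    y = g_new - g_old
    b_hs = (g_new @ y) / (y @ d)       # Hestenes-Stiefel
    b_dy = (g_new @ g_new) / (y @ d)   # Dai-Yuan
    return (1.0 - theta) * b_hs + theta * b_dy
```

At the extremes the scheme reduces to the pure methods: $\theta_k = 0$ gives HS and $\theta_k = 1$ gives DY.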