Modification of the Armijo line search to satisfy the convergence properties of the HS method

The Hestenes-Stiefel (HS) conjugate gradient algorithm is a useful tool for unconstrained numerical optimization: it has good numerical performance, but no global convergence result is known under traditional line searches. This paper proposes a line search technique that guarantees the global convergence of the HS conjugate gradient method. Numerical tests are presented to validate the different approaches.


Introduction
Consider the following unconstrained optimization problem:

$$\min_{x \in R^n} f(x), \qquad (1)$$

where f is continuously differentiable and its gradient g(x) = ∇f(x) is available. Iterative methods are widely used for solving (1), and the iterative formula is given by

$$x_{k+1} = x_k + \alpha_k d_k, \qquad (2)$$

where x_k ∈ R^n is the k-th approximation to the solution, α_k is a step length obtained by carrying out a line search, and d_k is a search direction.
There are many kinds of iterative methods, including Newton's method, the steepest descent method and nonlinear conjugate gradient methods. Conjugate gradient methods are among the most famous methods for solving the unconstrained optimization problem (1), especially in the case of large scale problems in scientific and engineering computation, due to the simplicity of their iteration and their low memory requirements. The search direction d_k is defined by

$$d_k = \begin{cases} -g_k, & k = 0, \\ -g_k + \beta_k d_{k-1}, & k \ge 1, \end{cases} \qquad (3)$$

where β_k is a scalar and g_k = g(x_k). The original nonlinear conjugate gradient method was proposed by Hestenes and Stiefel (the HS conjugate gradient method) [15], in which β_k is defined by

$$\beta_k^{HS} = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}}, \qquad y_{k-1} = g_k - g_{k-1}. \qquad (4)$$

There are at least six well-known formulas for β_k, among them:

Fletcher-Reeves [9]: $\beta_k^{FR} = \|g_k\|^2 / \|g_{k-1}\|^2$, (5)

Polak-Ribière-Polyak: $\beta_k^{PRP} = g_k^T y_{k-1} / \|g_{k-1}\|^2$, (6)

Conjugate Descent: $\beta_k^{CD} = -\|g_k\|^2 / (d_{k-1}^T g_{k-1})$, (7)

Liu-Storey: $\beta_k^{LS} = -g_k^T y_{k-1} / (d_{k-1}^T g_{k-1})$, (8)

Dai-Yuan: $\beta_k^{DY} = \|g_k\|^2 / (d_{k-1}^T y_{k-1})$. (9)
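For concreteness, the following Python sketch evaluates these six parameter choices (4)-(9) from the current and previous gradients and the previous search direction. The function name beta_choices and the assumption that all denominators are nonzero are ours, not the paper's.

```python
import numpy as np

def beta_choices(g, g_prev, d_prev):
    """Classical conjugate gradient parameters beta_k, formulas (4)-(9).

    g, g_prev : current and previous gradients g_k, g_{k-1}
    d_prev    : previous search direction d_{k-1}
    Assumes all denominators are nonzero.
    """
    y = g - g_prev              # y_{k-1} = g_k - g_{k-1}
    dy = d_prev @ y             # d_{k-1}^T y_{k-1}
    dg = d_prev @ g_prev        # d_{k-1}^T g_{k-1}
    return {
        "HS":  (g @ y) / dy,                  # Hestenes-Stiefel (4)
        "FR":  (g @ g) / (g_prev @ g_prev),   # Fletcher-Reeves (5)
        "PRP": (g @ y) / (g_prev @ g_prev),   # Polak-Ribiere-Polyak (6)
        "CD":  -(g @ g) / dg,                 # Conjugate Descent (7)
        "LS":  -(g @ y) / dg,                 # Liu-Storey (8)
        "DY":  (g @ g) / dy,                  # Dai-Yuan (9)
    }

# Example: one step starting from a steepest descent direction.
g_prev = np.array([3.0, -1.0]); d_prev = -g_prev; g = np.array([0.5, 0.2])
print(beta_choices(g, g_prev, d_prev))
```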
Zoutendijk [31] proved that the FR method with exact line search is globally convergent. Al-Baali [2] extended this result to the strong Wolfe-Powell line search. Powell [21] proved that the sequence of gradient norms ‖g_k‖ can be bounded away from zero only when

$$\sum_{k \ge 1} \frac{1}{\|d_k\|^2} < \infty, \qquad (10)$$

so one can prove that the FR method is globally convergent for general functions by using (10). However, global convergence has not been established for the PRP method with the strong Wolfe-Powell line search conditions. In fact, Powell proved that even if the step length is chosen to be the least positive minimizer of the one-variable function Φ_k(α) = f(x_k + αd_k), α ∈ R, the PRP method can cycle infinitely without approaching a solution.
Some convergent versions were proposed by using new, more complicated line searches or by restricting the parameter β_k to a nonnegative number [12,13,25,26,27]. The CD method was proved to have the global convergence property under the strong Wolfe line search with a strong restriction on the parameters [5], and the DY method has global convergence under the weak Wolfe line search [6]. Some impressive literature on conjugate gradient methods can be found in [4,5,7,10,11,16,17,22,23,30].
However, to the best of our knowledge, the global convergence of the original LS and HS methods has not been proved under any of the line searches mentioned above. In this paper, we propose a new line search procedure and investigate the global convergence of the original HS method.
Throughout, we work under the sufficient descent condition

$$g_k^T d_k \le -c \|g_k\|^2$$

for some constant c ∈ ]0, 1[. Once the descent direction d_k is determined at the k-th iteration, we seek a step size along this direction to complete the iteration.
There are many approaches for finding a suitable step size. Among them, the exact line search is ideal but is costly or even impossible to use in practice. Inexact line searches are often useful and effective in practical computation, such as the Armijo line search [1] and the Goldstein and Wolfe line searches [8,14,28,29]. The Armijo line search is commonly used and easy to implement in practical computation.

Armijo line search
Let s > 0 be a constant, ρ ∈ (0, 1) and µ ∈ (0, 1). Choose α_k to be the largest α in {s, sρ, sρ², ...} such that

$$f(x_k + \alpha d_k) \le f(x_k) + \mu \alpha g_k^T d_k.$$

The drawback of the Armijo line search is the choice of the initial step size s. If s is too large, then the procedure needs many more function evaluations. If s is too small, then the efficiency of the related algorithm decreases. Thereby, we should choose an adequate initial step size s at each iteration, so that the step size α_k is found easily.
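As a concrete illustration, here is a minimal Python sketch of this classical backtracking rule. The max_backtracks cap is an implementation safeguard we add so that the loop always terminates in floating-point arithmetic; it is not part of the textbook rule.

```python
def armijo_line_search(f, x, g, d, s=1.0, rho=0.5, mu=1e-4, max_backtracks=50):
    """Classical Armijo backtracking line search.

    Returns the largest alpha in {s, s*rho, s*rho^2, ...} satisfying
        f(x + alpha*d) <= f(x) + mu * alpha * g^T d.
    f : objective, x : current point, g : gradient at x,
    d : search direction (g^T d < 0 is assumed).
    """
    fx = f(x)
    slope = g @ d          # g_k^T d_k, negative for a descent direction
    alpha = s
    for _ in range(max_backtracks):
        if f(x + alpha * d) <= fx + mu * alpha * slope:
            return alpha
        alpha *= rho       # backtrack: try the next candidate s*rho^j
    return alpha           # fallback after max_backtracks reductions
```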
In this paper we propose a new Armijo-modified line search in which an appropriate initial step size s is defined and varies at each iteration.
The new Armijo-modified line search enables us to find the step size α k easily at each iteration and guarantees the global convergence of the original HS conjugate gradient method under some mild conditions.
The global convergence and the linear convergence rate are analyzed, and numerical results show that the HS method with the new Armijo-modified line search is more effective than other similar methods in solving large scale minimization problems.

New Armijo-Modified Line Search
We first make the following assumptions.

Assumption A. The objective function f(x) is continuously differentiable and bounded below on R^n.

Assumption B. The gradient g(x) = ∇f(x) is Lipschitz continuous; that is, there exists a constant L > 0 such that

$$\|g(x) - g(y)\| \le L \|x - y\| \quad \text{for all } x, y \in R^n.$$

New Armijo-modified line search. Let ρ ∈ (0, 1), µ ∈ (0, 1) and c ∈ ]0, 1[ be constants, and let L_k > 0 be an estimate of the Lipschitz constant L at the k-th iteration. Set the initial step size

$$s_k = \frac{-g_k^T d_k}{L_k \|d_k\|^2},$$

and choose α_k to be the largest α in {s_k, s_k ρ, s_k ρ², ...} satisfying the two acceptance conditions of the new line search: the Armijo decrease

$$f(x_k + \alpha d_k) \le f(x_k) + \mu \alpha g_k^T d_k,$$

and the requirement that the next direction remain a sufficient descent direction,

$$g(x_k + \alpha d_k)^T d_k(\alpha) \le -c \|g(x_k + \alpha d_k)\|^2,$$

where d_k(α) = -g(x_k + αd_k) + β^{HS}(α) d_k and β^{HS}(α) is the HS parameter (4) evaluated at x_k + αd_k.
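Because parts of the displayed conditions are reconstructed here from the proofs below, the following Python sketch should be read as one plausible implementation under those assumptions (the Lipschitz-based initial step s_k and the two acceptance tests), not as the authors' exact procedure.

```python
import numpy as np

def hs_beta(g_new, g, d):
    """HS parameter (4): beta = g_new^T y / (d^T y), y = g_new - g.
    Assumes d^T y is nonzero."""
    y = g_new - g
    return (g_new @ y) / (d @ y)

def modified_armijo(f, grad, x, g, d, L_k, rho=0.75, mu=0.25, c=0.75,
                    max_backtracks=50):
    """Sketch of the new Armijo-modified line search (a reconstruction).

    Initial step s_k = -g^T d / (L_k ||d||^2); backtrack by rho until
    both the Armijo decrease and the sufficient descent condition on
    the next HS direction hold. Assumes d is a descent direction.
    """
    fx = f(x)
    slope = g @ d                      # g_k^T d_k < 0
    alpha = -slope / (L_k * (d @ d))   # s_k, based on the Lipschitz estimate
    for _ in range(max_backtracks):
        x_new = x + alpha * d
        g_new = grad(x_new)
        d_new = -g_new + hs_beta(g_new, g, d) * d
        armijo_ok = f(x_new) <= fx + mu * alpha * slope      # Armijo decrease
        descent_ok = g_new @ d_new <= -c * (g_new @ g_new)   # sufficient descent
        if armijo_ok and descent_ok:
            return alpha, x_new, g_new, d_new
        alpha *= rho
    return alpha, x_new, g_new, -g_new  # fallback: steepest descent direction
```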

Algorithm and Convergence Properties
In this section we state the HS method with the new Armijo-modified line search as a formal algorithm and establish its basic convergence properties.

Step 0: Given x_0 ∈ R^n, set d_0 = -g_0 and k := 0.
Step 1: If g_k = 0, then stop; otherwise go to Step 2.
Step 2: Compute d_k by (3) with β_k = β_k^{HS} given by (4), determine α_k by the new Armijo-modified line search, and set x_{k+1} = x_k + α_k d_k.
Step 3: Set k := k + 1 and go to Step 1.
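A sketch of the full iteration, under the same assumptions as the line search sketch above: it follows Steps 0-3, with a gradient-norm tolerance tol replacing the exact test g_k = 0 and estimate_L standing for any of the Lipschitz estimating formulas discussed later.

```python
import numpy as np

def hs_method(f, grad, x0, estimate_L, tol=1e-8, max_iter=10000):
    """HS conjugate gradient method with the new Armijo-modified line search.

    estimate_L(x, x_prev, g, g_prev) returns the Lipschitz estimate L_k;
    see the estimating formulas (34)-(36) in the numerical section.
    """
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                  # Step 0: d_0 = -g_0
    x_prev, g_prev = None, None
    for k in range(max_iter):
        if np.linalg.norm(g) <= tol:        # Step 1 (numerical stopping test)
            break
        L_k = estimate_L(x, x_prev, g, g_prev)
        # Step 2: the line search returns alpha_k and the next HS direction
        alpha, x_new, g_new, d_new = modified_armijo(f, grad, x, g, d, L_k)
        x_prev, g_prev = x, g               # keep previous iterate for L_{k+1}
        x, g, d = x_new, g_new, d_new       # Step 3: advance to iteration k+1
    return x
```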

Some simple properties of the above algorithm are given as follows.

Lemma 1. Assume that (A) and (B) hold and that the HS method with the new Armijo-modified line search generates an infinite sequence {x_k}. Then there exists a constant C > 0 such that ‖d_k‖ ≤ C‖g_k‖ for all k.
Proof. By assumption (B), the Cauchy-Schwarz inequality and the HS formula (4), the stated bound follows. The proof is finished.

Lemma 2. Assume that (A) and (B) hold. Then the new Armijo-modified line search is well defined.
Proof. On the one hand, since g_k^T d_k < 0, the Armijo inequality of the new line search holds for all sufficiently small α > 0. On the other hand, by Lemma 1, the sufficient descent condition also holds for all sufficiently small α > 0. Hence both acceptance conditions hold for every α in some interval [0, ᾱ_k] with ᾱ_k > 0, and the backtracking procedure terminates finitely: the new Armijo-modified line search is well defined. The proof is completed.

Global Convergence
Lemma 3. Assume that (A) and (B) hold, the HS method with the new Armijo-modified line search generates an infinite sequence {x_k}, and there exist m_0 > 0 and M_0 > 0 such that m_0 ≤ L_k ≤ M_0. Then the initial step sizes s_k are bounded away from zero.

Proof. For k = 0 we have, since d_0 = -g_0,

$$s_0 = \frac{-g_0^T d_0}{L_0 \|d_0\|^2} = \frac{1}{L_0} \ge \frac{1}{M_0} > 0.$$

For k > 0, the procedure of the new Armijo-modified line search guarantees the sufficient descent condition -g_k^T d_k ≥ c‖g_k‖². By the Cauchy-Schwarz inequality, this inequality, the bound on L_k and the HS formula (via Lemma 1), we have

$$s_k = \frac{-g_k^T d_k}{L_k \|d_k\|^2} \ge \frac{c \|g_k\|^2}{M_0 \|d_k\|^2} \ge \frac{c}{M_0 C^2} > 0.$$

Theorem 1. Assume that (A) and (B) hold, the HS method with the new Armijo-modified line search generates an infinite sequence {x_k}, and there exist m_0 > 0 and M_0 > 0 such that m_0 ≤ L_k ≤ M_0. Then lim_{k→∞} ‖g_k‖ = 0.

Proof. Let η_0 = inf_k {α_k}.
If η_0 > 0, then the Armijo inequality of the line search together with the sufficient descent condition gives

$$f(x_k) - f(x_{k+1}) \ge -\mu \alpha_k g_k^T d_k \ge \mu \eta_0 c \|g_k\|^2.$$

By (A), f is bounded below, so summing over k yields $\sum_k \|g_k\|^2 < \infty$, and thus lim_{k→∞} ‖g_k‖ = 0.
In the following, we prove that η_0 > 0. To the contrary, assume that η_0 = 0. Then there exists an infinite subset K ⊆ {0, 1, 2, ...} such that

$$\lim_{k \in K, k \to \infty} \alpha_k = 0. \qquad (16)$$

By Lemma 3, the initial step sizes s_k are bounded away from zero, so α_k < s_k for all sufficiently large k ∈ K. Hence, for α = α_k/ρ, at least one of the following two inequalities,

$$f(x_k + \alpha d_k) \le f(x_k) + \mu \alpha g_k^T d_k \qquad (17)$$

and

$$g(x_k + \alpha d_k)^T d_k(\alpha) \le -c \|g(x_k + \alpha d_k)\|^2, \qquad (18)$$

does not hold. If (17) does not hold, then we have f(x_k + αd_k) - f(x_k) > µα g_k^T d_k. Using the mean value theorem on the left-hand side of this inequality, there exists θ_k ∈ (0, 1) such that

$$\alpha \, g(x_k + \theta_k \alpha d_k)^T d_k > \mu \alpha \, g_k^T d_k. \qquad (19)$$

By (B), the Cauchy-Schwarz inequality, (19) and Lemma 1, we obtain a positive lower bound on α = α_k/ρ that is independent of k ∈ K; together with Lemma 3, this contradicts (16). If (18) does not hold, then the sufficient descent inequality fails at α, and by applying the Cauchy-Schwarz inequality to its left-hand side, we again obtain a positive lower bound on α = α_k/ρ independent of k ∈ K. Combining this with Lemma 3, we also obtain a contradiction with (16). This shows that η_0 > 0. The whole proof is completed.

Linear Convergence Rate
In this section we prove that the HS method with the new Armijo-modified line search has a linear convergence rate under some mild conditions. We further assume the following.

Assumption C. The sequence {x_k} generated by the HS method with the new Armijo-modified line search converges to x*, the Hessian ∇²f(x*) is a symmetric positive definite matrix, and f(x) is twice continuously differentiable in a neighborhood N(x*, ε_0) of x*.

Under Assumption C there exist constants 0 < m ≤ M such that

$$m \|y\|^2 \le y^T \nabla^2 f(x) \, y \le M \|y\|^2 \quad \text{for all } x \in N(x^*, \varepsilon_0), \; y \in R^n, \qquad (22)$$

and thus

$$\tfrac{1}{2} m \|x - x^*\|^2 \le f(x) - f(x^*) \le \tfrac{1}{2} M \|x - x^*\|^2. \qquad (23)$$

By (23) and (22) we can also obtain, from the Cauchy-Schwarz inequality, that

$$m \|x - x^*\| \le \|g(x)\| \le M \|x - x^*\| \quad \text{for all } x \in N(x^*, \varepsilon_0).$$

Lemma 4. Under Assumption C, the above inequalities hold on N(x*, ε_0).

Proof. Its proof can be found in the literature (e.g. [29]).
Theorem 2. Assume that Assumption C holds, the HS method with the new Armijo-modified line search generates an infinite sequence {x_k}, and there exist m′ > 0 and M′ > 0 such that m′ ≤ L_k ≤ M′. Then {x_k} converges to x* at least R-linearly.

Proof. Its proof can be found in the literature (e.g. [26]).

Numerical Reports
In this section, we shall conduct some numerical experiments to show the efficiency of the new Armijo-modified line search used in the HS method.
The Lipschitz constant L of g(x) is usually not known a priori in practical computation and needs to be estimated. In the sequel, we discuss this problem and present some approaches for estimating L. In a recent paper [24], some approaches for estimating L were proposed. If k ≥ 1, then we set

$$\delta_{k-1} = x_k - x_{k-1}, \qquad y_{k-1} = g_k - g_{k-1},$$

and obtain the following three estimating formulas:

$$L_k = \frac{\|y_{k-1}\|}{\|\delta_{k-1}\|}, \qquad (34)$$

$$L_k = \frac{\delta_{k-1}^T y_{k-1}}{\|\delta_{k-1}\|^2}, \qquad (35)$$

$$L_k = \frac{\|y_{k-1}\|^2}{\delta_{k-1}^T y_{k-1}}. \qquad (36)$$

In fact, if L is a Lipschitz constant, then any L′ greater than L is also a Lipschitz constant, which allows us to choose a large Lipschitz constant. However, a very large Lipschitz constant possibly leads to a very small step size and makes the HS method with the new Armijo-modified line search converge very slowly. Thereby, we should seek Lipschitz constants that are as small as possible in practical computation.
In the k-th iteration we take the safeguarded Lipschitz constants

$$L_k = \max\Big(L_0, \frac{\|y_{k-1}\|}{\|\delta_{k-1}\|}\Big), \quad L_k = \max\Big(L_0, \frac{\delta_{k-1}^T y_{k-1}}{\|\delta_{k-1}\|^2}\Big), \quad L_k = \max\Big(L_0, \min\Big(\frac{\|y_{k-1}\|^2}{\delta_{k-1}^T y_{k-1}}, M'_0\Big)\Big),$$

respectively, with L_0 > 0 and M′_0 a large positive number.
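A Python sketch of these estimates follows; the max/min safeguarding by L_0 and M′_0 is written as reconstructed above (an assumption consistent with Lemma 5 below), and the k = 0 fallback to L_0 is likewise an implementation choice.

```python
import numpy as np

def lipschitz_estimate(x, x_prev, g, g_prev, variant=34, L0=1.0, M0_big=1e8):
    """Safeguarded Lipschitz estimates (34)-(36) for the modified line search.

    variant selects the estimating formula: 34, 35 or 36.
    Falls back to L0 at the first iteration (no previous point yet).
    """
    if x_prev is None:                 # k = 0: no difference vectors available
        return L0
    delta = x - x_prev                 # delta_{k-1} = x_k - x_{k-1}
    y = g - g_prev                     # y_{k-1} = g_k - g_{k-1}
    if variant == 34:
        est = np.linalg.norm(y) / np.linalg.norm(delta)
        return max(L0, est)
    if variant == 35:
        est = (delta @ y) / (delta @ delta)
        return max(L0, est)
    # variant 36: clamp above by M0_big since delta^T y may be tiny
    est = (y @ y) / (delta @ y)
    return max(L0, min(est, M0_big))
```

This routine matches the estimate_L interface of the hs_method sketch above, e.g. via lambda x, xp, g, gp: lipschitz_estimate(x, xp, g, gp, variant=34).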
Lemma 5. Assume that (A) and (B) hold, the HS method with the new Armijo-modified line search generates an infinite sequence {x_k}, and L_k is evaluated by (34), (35) or (36). Then there exist m_0 > 0 and M_0 > 0 such that m_0 ≤ L_k ≤ M_0 for all k.

Proof. Obviously, L_k ≥ L_0, so we can take m_0 = L_0. For (34), assumption (B) gives ‖y_{k-1}‖ ≤ L‖δ_{k-1}‖, and hence L_k ≤ max(L_0, L). For (35), by the Cauchy-Schwarz inequality,

$$\frac{\delta_{k-1}^T y_{k-1}}{\|\delta_{k-1}\|^2} \le \frac{\|y_{k-1}\|}{\|\delta_{k-1}\|} \le L,$$

so again L_k ≤ max(L_0, L). For (36), the safeguard gives L_k ≤ max(L_0, M′_0). By letting M_0 = max(L_0, L, M′_0), we complete the proof.

Birgin and Martinez developed a family of scaled conjugate gradient algorithms, called the spectral conjugate gradient (SCG) method [3]. Numerical experiments showed that some special SCG methods were effective. In one SCG method, the initial choice of α at the k-th iteration was

$$\alpha = \alpha_{k-1} \frac{\|d_{k-1}\|}{\|d_k\|}. \qquad (38)$$

We used the test problems of [18] to implement the HS method with the new Armijo-modified line search, and we set the parameters µ = 0.25, ρ = 0.75, c = 0.75 and L_0 = 1 in the numerical experiments. Five conjugate gradient algorithms (HS1, HS2, HS3, HS and PRP+) are compared in numerical performance.

HS1, HS2 and HS3 denote the HS method with the new Armijo-modified line search using the estimates (34), (35) and (36), respectively; HS denotes the original HS method with the strong Wolfe line search; and PRP+ denotes the PRP method with β_k = max(0, β_k^{PRP}) and the strong Wolfe line search. The stopping criterion is ‖g_k‖ ≤ 10⁻⁸, and the numerical results are given in Table 1.
In Table 1, CPU denotes the total CPU time (in seconds) for solving all 15 test problems, and a pair of numbers means the number of iterations and the number of function evaluations. It can be seen from Table 1 that the HS method with the new Armijo-modified line search is effective for solving some large scale problems. In particular, HS1 seems to be the best of the five algorithms, because it uses the fewest iterations and function evaluations to reach the same precision. This suggests that the estimating formula (34) may be more reasonable than the other formulas. In fact, by the Cauchy-Schwarz inequality,

$$\frac{\delta_{k-1}^T y_{k-1}}{\|\delta_{k-1}\|^2} \le \frac{\|y_{k-1}\|}{\|\delta_{k-1}\|} \le \frac{\|y_{k-1}\|^2}{\delta_{k-1}^T y_{k-1}}$$

whenever δ_{k-1}^T y_{k-1} > 0. This motivates us to guess that a suitable Lipschitz constant should be chosen in the interval

$$\Big[\frac{\delta_{k-1}^T y_{k-1}}{\|\delta_{k-1}\|^2}, \; \frac{\|y_{k-1}\|^2}{\delta_{k-1}^T y_{k-1}}\Big].$$

It can also be seen from Table 1 that the HS methods with the new line search are superior to the HS and PRP+ conjugate gradient methods. Moreover, the HS method may fail in some cases if inadequate parameters are chosen. Although the PRP+ conjugate gradient method has global convergence, its numerical performance is not better than that of the HS method in many situations.
Numerical experiments show that the new line search proposed in this paper is effective for the HS method in practical computation. The reason is that the Lipschitz constant estimate in the new line search defines an adequate initial step size s_k, from which a suitable step size α_k for the HS method is found quickly; this reduces the number of function evaluations at each iteration and improves the efficiency of the HS method.
The initial choice of step size (38) may likewise be the reason that the SCG method is effective in practical computation. All these facts show that choosing an adequate initial step size at each iteration is very important for line search methods, especially for conjugate gradient methods.

Conclusion
In this paper, a new form of the Armijo-modified line search has been proposed that guarantees the global convergence of the HS conjugate gradient method for minimizing functions with Lipschitz continuous gradients. It requires estimating the local Lipschitz constant of the derivative of the objective function in practical computation. The global convergence and the linear convergence rate of the HS method with the new Armijo-modified line search were analyzed under some mild conditions. Numerical results showed that the HS method with the new Armijo-modified line search is effective and superior to the HS conjugate gradient method with the strong Wolfe line search. For further research, we should not only develop more techniques for estimating the parameters but also carry out more extensive numerical experiments.
