Optimal control of fractional integro-differential systems based on a spectral method and grey wolf optimizer

Article History: Received 16 November 2018; Accepted 24 July 2019; Available 14 January 2020.

This paper develops an effective and robust metaheuristic algorithm with acceptable performance in terms of solution accuracy. The algorithm is applied to the optimal control of a fractional Volterra integro-differential (FVID) equation, which is replaced by a nonlinear programming (NLP) problem. First, the FVID problem is converted to an NLP using spectral collocation techniques; then the grey wolf optimizer (GWO) is executed to improve speed and accuracy and to find the optimal control and state solutions as well as the optimal value of the cost function. The GWO is simple to apply and, since it is a global search algorithm, it can be used to find the optimal solution of the NLP. The efficiency of the proposed scheme is shown by comparing the obtained results with those of local methods. Furthermore, some illustrative examples are presented with their approximate solutions, and the results of the present approach are compared with those achieved by other methods.


Introduction
The main purpose of this paper is to introduce an efficient approach for solving the following optimal control problem (OCP):

Problem A: Find the optimal control u* and the corresponding optimal state x* that minimize the quadratic performance index

J = ∫_0^T [ x²(t) + u²(t) + f(t)x(t) + g(t)u(t) ] dt,  (1)

subject to the fractional Volterra integro-differential (FVID) equation

D^α x(t) = a(t)x(t) + b(t)u(t) + ∫_0^t K(t, s) φ(x(s)) ds,  (2)

where a(t), b(t), f(t), g(t) are known real-valued functions belonging to L²[0, T] and φ(x(s)) is a nonlinear function of the unknown function x(s).
In various problems of physics, mechanics, and engineering, fractional differential equations have proved to be a valuable tool in the modeling of many phenomena, with applications in viscoelasticity, electrochemistry, control theory, and electromagnetics [1,2]. Consequently, the subject of fractional equations is gaining much importance and attention. Meanwhile, the study of OCPs governed by fractional integro-differential equations is also important, as such systems occur in various applied problems. Some approaches for the numerical solution of fractional optimal control problems can be found in [3][4][5][6].
We should be mindful that only special cases of OCPs can be solved analytically, so choosing the best numerical scheme in terms of speed of convergence and accuracy is significant. The method implemented for discretizing the mentioned OCP is a spectral method, one of the most accurate methods, used by several authors on different kinds of functional problems; see, for example, [7,8] and the references therein. The idea is to write the solution of the OCP as a sum of Bernoulli polynomials; substituting these approximations into the OCP yields an NLP in the coefficients, which can be solved using any metaheuristic algorithm presented in the literature for optimization problems.
Among these algorithms, nature-inspired metaheuristics are appropriate for global searches because of their ability to explore globally and exploit locally. Mirjalili et al. [9] proposed the grey wolf optimizer (GWO), inspired by the behavior of grey wolves in nature: the GWO algorithm simulates the leadership hierarchy and hunting behavior of grey wolves. GWO has shown good performance when applied to nonlinear continuous optimization problems, and it has been compared with particle swarm optimization, the gravitational search algorithm, differential evolution, evolutionary programming, and evolution strategies to confirm its results. So, GWO is theoretically able to solve our NLP. Some advantages of the GWO are:
• The social hierarchy helps GWO to keep track of the best solutions generated over the course of iterations.
• The encircling procedure defines a circle-shaped neighborhood around the solutions, which extends to higher dimensions as a hyper-sphere.
• The random parameters give candidate solutions hyper-spheres with different random radii.
• The hunting approach allows candidate solutions to locate the probable position of the prey.
• Exploration and exploitation are guaranteed by the adaptive values of two parameters.
• These adaptive values help GWO to trade off efficiently between exploration and exploitation.
• The GWO has only two main parameters to be controlled.
The paper is organized as follows. Section 2 presents basic concepts about the Bernoulli polynomials and how to approximate functions in terms of these polynomials, together with the operational matrices of fractional integration and some definitions from fractional calculus. Section 3 outlines our spectral scheme for discretizing the aforementioned optimal control problem and obtaining the resulting NLP. Section 4 is devoted to the grey wolf optimizer algorithm for solving the problem under consideration. In section 5, numerical results are reported to verify the applicability of the presented method in comparison with other methods in the literature. Finally, section 6 ends this paper with a brief conclusion and some remarks.

Preliminaries
In this section, we give some basic concepts we require.

Fractional Calculus
This section reviews some basic definitions and notation of the fractional integral and derivative which are applied later in this work [10].
Definition 1. The Riemann-Liouville fractional integral operator of order α > 0 is defined by

I^α ξ(t) = (1/Γ(α)) ∫_0^t (t − s)^{α−1} ξ(s) ds,

in which Γ(·) denotes the Gamma function; for α = 0 we set I^0 ξ(t) = ξ(t).
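As a quick sanity check on Definition 1, the following sketch approximates the Riemann-Liouville integral by a crude midpoint quadrature rule (the function names and the quadrature rule are ours, not part of the paper's method) and compares it with the closed form I^α t = t^{1+α}/Γ(2+α):

```python
import math

def rl_integral(xi, alpha, t, n=4000):
    """Midpoint-rule approximation of the Riemann-Liouville integral
    I^alpha xi(t) = (1/Gamma(alpha)) * int_0^t (t-s)^(alpha-1) xi(s) ds.
    Midpoints avoid evaluating the weakly singular kernel at s = t."""
    if alpha == 0:
        return xi(t)
    h = t / n
    total = sum((t - (k + 0.5) * h) ** (alpha - 1) * xi((k + 0.5) * h)
                for k in range(n))
    return total * h / math.gamma(alpha)

# Closed form for xi(s) = s:  I^alpha t = t^(1+alpha) / Gamma(2+alpha)
approx = rl_integral(lambda s: s, 0.5, 1.0)
exact = 1.0 / math.gamma(2.5)
```

The midpoint rule converges slowly near the singular endpoint, so this is only a plausibility check, not a production discretization.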
One type of fractional derivative is the Caputo fractional derivative, which is frequently used in applications:

D^α ξ(t) = I^{n−α} D^n ξ(t),  n − 1 < α < n,  n ∈ N.

Note that for n − 1 < α < n, n ∈ N, the relation

I^α D^α ξ(t) = ξ(t) − Σ_{k=0}^{n−1} ξ^{(k)}(0) t^k / k!

holds almost everywhere on [0, T].

An overview on Bernoulli polynomials
Bernoulli polynomials of order m can be defined by the following formula [11]:

β_m(t) = Σ_{i=0}^{m} C(m, i) α_{m−i} t^i,

where α_i, i = 0, 1, · · · , m, are Bernoulli numbers. These numbers are a sequence of signed rational numbers which arise in the series expansion of trigonometric functions [12] and can be defined by the identity

t / (e^t − 1) = Σ_{i=0}^{∞} α_i t^i / i!.

The first few Bernoulli numbers are

α_0 = 1,  α_1 = −1/2,  α_2 = 1/6,  α_4 = −1/30,

with α_{2i+1} = 0, i = 1, 2, 3, · · · . Bernoulli polynomials form a complete basis over the interval [0, 1] [13]. These polynomials satisfy the following formula [12]:

β_n(t + 1) − β_n(t) = n t^{n−1}.

The first few Bernoulli polynomials are

β_0(t) = 1,  β_1(t) = t − 1/2,  β_2(t) = t² − t + 1/6.

Presume that H := L²[0, 1] and

Y = span{β_0(t), β_1(t), . . . , β_m(t)},

where m ∈ N ∪ {0} and the β_i's are the Bernoulli polynomials. Since Y ⊂ H is a finite-dimensional vector space, for every f ∈ H there exists a unique y_0 ∈ Y such that

∥f − y_0∥ ≤ ∥f − y∥  for all y ∈ Y.

Here, the function y_0 is called the best approximation to f out of Y. As y_0 ∈ Y, we may conclude that

y_0(t) = Σ_{i=0}^{m} c_i β_i(t) = C^T Ψ(t),  Ψ(t) = [β_0(t), . . . , β_m(t)]^T,  (10)

and

C = Q^{−1} ∫_0^1 f(t) Ψ(t) dt,

where Q ∈ R^{(m+1)×(m+1)} is called the dual matrix of Ψ(t) and is given by

Q = ∫_0^1 Ψ(t) Ψ^T(t) dt.  (11)

For more details about best approximation see [13].
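The Bernoulli numbers and polynomials above can be generated exactly with rational arithmetic; the sketch below (our own helper names, not the paper's code) uses the standard recurrence Σ_{j=0}^{k} C(k+1, j) α_j = 0, k ≥ 1, with α_0 = 1:

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(m):
    """alpha_0 .. alpha_m from the recurrence
    alpha_k = -(1/(k+1)) * sum_{j=0}^{k-1} C(k+1, j) * alpha_j, alpha_0 = 1."""
    a = [Fraction(1)]
    for k in range(1, m + 1):
        a.append(-sum(comb(k + 1, j) * a[j] for j in range(k)) / (k + 1))
    return a

def bernoulli_poly(m, t):
    """beta_m(t) = sum_{i=0}^{m} C(m, i) * alpha_{m-i} * t^i."""
    a = bernoulli_numbers(m)
    return sum(comb(m, i) * a[m - i] * Fraction(t) ** i for i in range(m + 1))

# beta_2(t) = t^2 - t + 1/6, so beta_2(1/2) = -1/12
val = bernoulli_poly(2, Fraction(1, 2))
```

Exact rational output makes it easy to verify the listed Bernoulli numbers and polynomials term by term.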

Bernoulli operational matrix of the fractional integration
In recent years, operational matrices have attracted researchers' attention and have been applied to solving problems involving continuous operators (such as integral, derivative, delay, etc.). Moreover, numerical methods based on these operational matrices are easily implemented and have the following characteristics:
⋄ they play a significant role as a preconditioner in inverse problems,
⋄ they have higher accuracy due to their sparsity.
Although F^γ is given in [4], we use a different approach and notation to obtain this matrix. For this purpose, assume that S. As a result, F^γ can be expressed as

Bernoulli polynomial collocation method
For discretization of the integro-differential dynamic system (2), we express the fractional state rate D^γ x(t) and the control variable u(t) in terms of Bernoulli polynomials as

D^γ x(t) ≈ X^T Ψ(t),  u(t) ≈ U^T Ψ(t),  (12)

where X and U are unknown vectors and Ψ(t) is given in (10). Using Lemma 2.1 and Equation (12), x(t) can be represented by

x(t) ≈ X^T F^γ Ψ(t) + E^T Ψ(t),

where F^γ is the fractional operational matrix of integration and E^T = [x_0, 0, . . . , 0]_{1×(m+1)}. We now replace (16) and (17) in the dynamic system (2). In order to specify the unknown coefficients in (18), we collocate this equation at m + 1 collocation points; the points t_i, i = 0, . . . , m, are the Chebyshev-Gauss-Lobatto nodes in [0, 1], which we choose as suitable collocation points.

In order to utilize the Gauss-Legendre (GL) quadrature formula, by means of the transformation s = (t_i/2)(τ + 1), (19) converts to

∫_0^{t_i} K(t_i, s) φ(x(s)) ds ≈ (t_i/2) Σ_j ω_j K(t_i, s(τ_j)) φ(x(s(τ_j))),  (20)

where the τ_j's are the GL nodes, i.e., the zeros of the Legendre polynomials on the interval [−1, 1], and the ω_j's are the corresponding weights. Although explicit formulas for the quadrature nodes are not known, the weights can be expressed in closed form by the relation

ω_j = 2 / ((1 − τ_j²) [P'_n(τ_j)]²).

Consequently, the controlled FVID (2) is reduced to m + 1 nonlinear algebraic equations.

For discretization of the performance index stated in (1), we approximate f(t) and g(t) by Bernoulli polynomials as

f(t) ≈ F^T Ψ(t),  g(t) ≈ G^T Ψ(t).  (21)

Substituting (21) into (1) and integrating on [0, 1] yields (22)-(23), in which Q is given in (11). So, the OCP given in (1) and (2) is converted to an NLP with objective functional (22) and constraints (20). The resulting NLP is large scale, so it is of great importance to use an efficacious and compatible metaheuristic algorithm which generates highly accurate solutions. This research utilizes a new metaheuristic algorithm, called the grey wolf optimizer (GWO), to solve the problem under consideration. The next section describes the GWO and the elements and mechanisms it uses to solve the NLP governed by the OCP.
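The two discretization ingredients above, the Chebyshev-Gauss-Lobatto collocation points and the Gauss-Legendre treatment of the Volterra term, can be sketched as follows (a minimal illustration with our own helper names; the kernel and nonlinearity below are placeholders, not the paper's examples):

```python
import numpy as np

m = 7  # as in the paper's M = 7 experiments

# Chebyshev-Gauss-Lobatto nodes cos(pi*k/m), k = 0..m, mapped to [0, 1]
t_nodes = 0.5 * (1.0 - np.cos(np.pi * np.arange(m + 1) / m))

# Gauss-Legendre nodes tau_j and weights w_j on [-1, 1]
tau, w = np.polynomial.legendre.leggauss(8)

def volterra_term(ti, K, phi):
    """int_0^{ti} K(ti, s) phi(s) ds via the transformation s = (ti/2)(tau+1):
    approximated by (ti/2) * sum_j w_j K(ti, s_j) phi(s_j)."""
    s = 0.5 * ti * (tau + 1.0)
    return 0.5 * ti * np.sum(w * K(ti, s) * phi(s))

# Example: K = 1, phi(s) = s^2 gives int_0^t s^2 ds = t^3 / 3
val = volterra_term(1.5, lambda t, s: np.ones_like(s), lambda s: s ** 2)
```

With 8 Gauss-Legendre nodes the rule integrates polynomials up to degree 15 exactly, so smooth integrands are resolved to machine precision.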

Grey wolf optimizer
The proposed mathematical programming problem is nonlinear and large scale, so we study both local and metaheuristic approaches. Among the local algorithms, the trust region method plays a vital role in solving large-scale nonlinear optimization problems because of its efficiency [14,15]. However, it finds only local solutions, and takes a long time as T ≫ 1 or as the problem dimension increases. To prevent this type of imperfection, a grey wolf optimizer (GWO) algorithm is proposed in this research. The GWO algorithm is a nature-inspired metaheuristic which imitates the leadership hierarchy and hunting structure of grey wolves. Therefore, a high-performing metaheuristic approach with high-precision numerical solutions and short execution time is implemented to solve the problem. The GWO obtains near-optimal solutions or the global minimum of the objective functional in a more efficient way. It should be noted that the GWO is well suited to nonlinear optimization problems with many variables and constraints, especially when solving large-sized instances [16][17][18][19]. For our problem, the values of the objective functionals show that the GWO's performance is better than that of the local method in terms of the approximate solutions x(t) and u(t). On the basis of the above points, one can conclude that the GWO is a favorable candidate for solving the problem when T ≫ 1.
The GWO algorithm has derivative-free procedures. In contrast to gradient-based optimization algorithms, this metaheuristic minimizes the problem stochastically. The optimization mechanism begins with random solutions, and there is no need to compute derivatives over the search region or gradient information of the objective functional to obtain the global minimum. This makes the GWO algorithm highly applicable to NLP problems with unknown derivative information. Moreover, the simplicity of the GWO is particularly advantageous in the presence of non-smooth objective functionals, for which exact algorithms may fail to reach global solutions. The viability of the GWO has been analyzed on some non-smooth mathematical functions and engineering design problems [20][21][22]. So, the GWO algorithm is a favorable choice and a competitive algorithm for non-smooth and nonlinear functions. In this section, the essential nature of the GWO algorithm is explained. GWO is a nature-inspired metaheuristic algorithm first introduced by Mirjalili et al. [9]; it simulates the leadership hierarchy and hunting behavior of grey wolves.
In the leadership hierarchy, four types of grey wolves are employed: alpha, beta, delta, and omega. Also, searching for prey, encircling prey, hunting, and attacking prey are the main components of GWO. These steps are discussed in the following sections.

Social hierarchy
To formulate the social hierarchy of wolves, the fittest solution is considered the alpha (α). Accordingly, the second and third best solutions are called beta (β) and delta (δ), respectively. The remaining candidate solutions are represented as omega (ω). The hunting mechanism is guided by α, β, and δ, and the ω wolves follow these three wolves.

Encircling prey
As seen in nature, grey wolves encircle prey during the hunt. This encircling behavior is given by

D = |C · X_p(t) − X(t)|,  (24)
X(t + 1) = X_p(t) − A · D,  (25)

where t denotes the current iteration, A and C are coefficient vectors, X_p is the position vector of the prey, and X denotes the position vector of a grey wolf. The vectors A and C are computed as follows:

A = 2a · r_1 − a,  (26)
C = 2 · r_2,  (27)

where r_1 and r_2 are random vectors in [0, 1]. Over the course of iterations, the components of a are linearly reduced from 2 to 0.
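Equations (24)-(27) translate directly into code. The sketch below (our own function name, not the paper's) applies one encircling update and shows that with a = 0 the wolf lands exactly on the prey's position, since A vanishes:

```python
import numpy as np

rng = np.random.default_rng(0)

def encircle(X, X_p, a):
    """One encircling step:  D = |C*X_p - X|,  X_new = X_p - A*D,
    with A = 2a*r1 - a and C = 2*r2, where r1, r2 are uniform on [0, 1]."""
    r1, r2 = rng.random(X.shape), rng.random(X.shape)
    A = 2.0 * a * r1 - a
    C = 2.0 * r2
    D = np.abs(C * X_p - X)
    return X_p - A * D

X = np.array([1.0, -2.0])     # current wolf position
X_p = np.array([0.25, 0.75])  # prey position
moved = encircle(X, X_p, a=2.0)  # a stochastic step around the prey
fixed = encircle(X, X_p, a=0.0)  # A = 0, so the update returns X_p
```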

Hunting
Grey wolves have the capability to identify the position of prey and encircle it. The hunt is normally guided by the alpha; the beta and delta also partake in hunting occasionally. To model this, the first three best solutions are saved and the other search agents (including the omegas) are obliged to update their positions according to the positions of the best search agents. This process is stated by the following equations:

D_α = |C_1 · X_α − X|,  D_β = |C_2 · X_β − X|,  D_δ = |C_3 · X_δ − X|,
X_1 = X_α − A_1 · D_α,  X_2 = X_β − A_2 · D_β,  X_3 = X_δ − A_3 · D_δ,
X(t + 1) = (X_1 + X_2 + X_3) / 3.

Attacking prey
At the end of the hunt, grey wolves attack the prey when it stops moving. To model approaching the prey, the value of a is reduced from 2 to 0; the variation range of A is reduced accordingly. Specifically, A is a random value in the interval [−2a, 2a].

Search for prey
Grey wolves chiefly search according to the positions of the alpha, beta, and delta: they diverge from each other to search for prey and converge to attack prey. To model divergence, A with random values greater than 1 or less than −1 is used to force a search agent to diverge from the prey. Thus the GWO algorithm employs a global search strategy, which guarantees exploration.
As seen in Eq. (27), the C vector is another factor of exploration. This vector assigns random weights to the prey so as to stochastically emphasize (C > 1) or deemphasize (C < 1) the effect of the prey in defining the distance in Eq. (24). The C vector can also be viewed as the effect of obstacles to approaching prey in nature: obstacles in the hunting paths of wolves impede them from approaching prey swiftly and comfortably. Briefly, the search procedure starts with generating a random population of grey wolves. Over the course of iterations, the alpha, beta, and delta wolves estimate the probable position of the prey, and each candidate solution updates its distance from the prey. The parameter a is reduced from 2 to 0 to emphasize exploration and exploitation, respectively. Candidate solutions diverge from the prey when |A| > 1 and converge towards the prey when |A| < 1. Finally, the GWO algorithm stops when an end criterion is satisfied. The pseudocode of the GWO algorithm is presented in Algorithm 1.

Algorithm 1. GWO algorithm
1: Initialize the grey wolf population X_i = (x_i(t), u_i(t)), i = 1, · · · , n
2: Initialize a, A, and C
3: Calculate the fitness of each search agent
4: X_α = the best search agent
5: X_β = the second best search agent
6: X_δ = the third best search agent
7: while (k < Max number of iterations) do
8:   for each search agent do
9:     Update the position of the current search agent
10:  end for
11:  Update a, A, and C
12:  Calculate the fitness of all search agents
13:  Update X_α, X_β, and X_δ
14:  k = k + 1
15: end while
16: return X_α
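Algorithm 1 can be rendered compactly in Python as a generic minimizer; the sketch below follows the update rules described above, but the bounds, population size, and test function are our illustrative choices, not the paper's settings:

```python
import numpy as np

def gwo(f, dim, n_agents=20, max_iter=300, lb=-10.0, ub=10.0, seed=1):
    """Minimal GWO sketch: alpha, beta, delta (three best wolves) guide the
    pack; a decays linearly from 2 to 0, shifting exploration to exploitation."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, (n_agents, dim))
    best_x, best_f = None, np.inf
    for k in range(max_iter):
        fit = np.array([f(x) for x in X])
        order = np.argsort(fit)
        if fit[order[0]] < best_f:
            best_x, best_f = X[order[0]].copy(), fit[order[0]]
        leaders = X[order[:3]].copy()      # alpha, beta, delta
        a = 2.0 * (1.0 - k / max_iter)     # linear decay 2 -> 0
        for i in range(n_agents):
            pos = np.zeros(dim)
            for leader in leaders:
                r1, r2 = rng.random(dim), rng.random(dim)
                A = 2.0 * a * r1 - a
                C = 2.0 * r2
                pos += leader - A * np.abs(C * leader - X[i])
            X[i] = np.clip(pos / 3.0, lb, ub)
    return best_x, best_f

# Sphere function: global minimum 0 at the origin
x_star, f_star = gwo(lambda x: float(np.sum(x ** 2)), dim=5)
```

On this smooth test function the pack collapses onto the alpha as a → 0, driving the objective value close to the global minimum.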

Numerical experiments
In these examples, the OCP is first converted to an NLP with the proposed method of section 3. The resulting NLP is solved by using the fmincon function in MATLAB and by the GWO algorithm, to find a local and a global minimum of the constrained nonlinear function, respectively. For ease of reference in the tables, we use "local method" and "proposed method" to denote the results obtained from solving the NLP with the fmincon function and with the GWO algorithm, respectively. In order to demonstrate and justify the performance and accuracy of our scheme on OCPs governed by fractional integro-differential equations, we consider the following examples.

Example 1. Consider the OCP of Problem A subject to a nonlinear fractional integro-differential equation. The problem is to find the optimal control u*(t) which minimizes the quadratic performance index (31). For this problem, the exact solution in the case α = 1 is given by [23] x(t) = e^t, u(t) = e^{3t}. In Table 1, one can compare the optimal values of the objective functional obtained by the GWO algorithm and by the local method for M = 7 and different values of α. The numerical results for x(t) and u(t) with M = 7 and α = 0.5, 0.7, 0.9, 1 are plotted in Figures 1-2. In these figures, we see that our approximate solutions converge to the exact solution. The results obtained with the GWO method demonstrate the validity and effectiveness of the proposed method.

Example 2. Consider the following nonlinear problem [23]. The optimal control u* and corresponding optimal state x* for α = 1 are 1 + 2t and e^{t²}, respectively. The results are reported in Table 2. It is obvious that we achieve a better approximation with the GWO algorithm than with the local method.
Example 3. The optimal control u*(t) and corresponding optimal state x*(t) for α = 1 are as follows. We solve this OCP using the GWO and local methods for M = 7 and various α. Figures 5-6 show that as α → 1, the approximate solutions obtained with the GWO algorithm tend to the exact solution for α = 1. The value of the objective function obtained with the GWO and local methods for M = 7 and different values of α is shown in Table 3. From Table 3, we can see that the values of the objective function based on GWO are all less than the smallest value of the objective function obtained by the local method.
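The "local method" in the tables corresponds to MATLAB's fmincon; in Python, `scipy.optimize.minimize` plays the same role. Below is a hedged sketch of that local baseline on a toy stand-in NLP (the quadratic objective and single equality constraint are placeholders, not the paper's discretized problem, whose objective (22) and constraints (20) depend on the operational matrices):

```python
import numpy as np
from scipy.optimize import minimize

def objective(z):
    # Toy objective in the unknown coefficient vector z
    return float(np.sum(z ** 2) + z[0])

# One equality constraint, sum(z) = 1, standing in for the collocation system
cons = {"type": "eq", "fun": lambda z: z.sum() - 1.0}

res = minimize(objective, np.zeros(4), method="SLSQP", constraints=[cons])
# Analytic solution of this toy problem: z = (-1/8, 3/8, 3/8, 3/8), J = 5/16
```

Like fmincon, SLSQP is a gradient-based local solver: it converges quickly on this convex toy problem but, as the paper argues, can stall at local minima on the full nonlinear collocation system, which is where the GWO is advantageous.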

Conclusions
By utilizing a spectral method, the OCP governed by a fractional Volterra integro-differential equation is converted to an NLP. In this research, a powerful and efficacious metaheuristic algorithm, the grey wolf optimizer (GWO), is utilized to obtain the optimal control and state solutions as well as the optimal value of the objective function.
The GWO algorithm imitates the leadership hierarchy, with four types of grey wolves, and the hunting procedure, with searching for prey, encircling prey, and attacking prey. These strategies ensure the GWO's strong exploitation and exploration capability and its efficient escape from local optima. Numerical experiments verify the validity and applicability of the proposed method. Comparisons with the exact solutions and with other methods show that this technique is a powerful and efficient tool for solving fractional OCPs.