Numerical solution of coupled system of Emden-Fowler equations using artificial neural network technique

ABSTRACT


Introduction
Differential equations are used to model natural phenomena in many fields of science, including biology, ecology, physics, engineering, and chemistry. Due to the vast applications of these differential equations in many branches of science ([1], [2]), there is a great need to solve these kinds of equations. The Emden-Fowler equation is an important differential equation in astrophysics and physics. It is very difficult to solve numerically due to the singular behavior of the equation at the point ζ = 0. This equation arises in the study of stellar structure [3], which involves the evolution of a star under the laws of physics. A star's gravity balances the radiation of its core; this balance is called hydrostatic equilibrium. The Emden-Fowler equation arises when this equilibrium state is modeled while studying the stellar structure. In 1870, Jonathan Lane [4] introduced the equation, and Jacob Emden [5] generalized it further. The Lane-Emden equation is given as
$$\omega''(\zeta) + \frac{2}{\zeta}\,\omega'(\zeta) + \omega^{n}(\zeta) = 0, \qquad (1)$$
where n is the polytropic index. Many astrophysicists were interested in the behavior of the solutions of these equations under various initial conditions, and Fowler studied these equations during 1914-1931. The more general form of (1) is
$$\omega''(\zeta) + \frac{2}{\zeta}\,\omega'(\zeta) + \varphi(\omega) = 0, \qquad (2)$$
where φ(ω) is a linear or non-linear function.
The Emden-Fowler equation is also used in chemical reactors, fluid mechanics, and gas dynamics.
There are many variants of Emden-Fowler equations, including the coupled system of Emden-Fowler equations, which reads as follows:
$$\omega_1''(\zeta) + \frac{\vartheta_1}{\zeta}\,\omega_1'(\zeta) + \varphi_1(\zeta, \omega_1, \omega_2) = 0,$$
$$\omega_2''(\zeta) + \frac{\vartheta_2}{\zeta}\,\omega_2'(\zeta) + \varphi_2(\zeta, \omega_1, \omega_2) = 0, \qquad (3)$$
where ϑ1 > 0 and ϑ2 > 0 are real constants, and φ1 and φ2 are nonlinear functions of ζ, ω1, and ω2.

The groundwork of artificial neural networks (ANNs) was laid in 1943, when McCulloch and Pitts modeled a neuron as a switch that receives input from other neurons and, depending on the total weighted input, is either activated or remains inactive. Artificial neural networks are thus inspired by the sensory processing of the brain. An ANN can be created by simulating a network of model neurons in a computer. The structure comprises an input layer, an output layer, and hidden layers, and every layer contains neurons. By using a learning algorithm, we can make the network learn to solve mathematical problems. A model neuron receives input from other units, weighs each input, and adds them up; if the total input is above a threshold, the neuron produces an output. When a structure has multiple hidden layers, we speak of deep neural networks (DNNs). Deep neural networks have been proposed as a way to produce more predictive models, since combining a large number of layers endows the model with high predictive power.
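As a small illustration of this classical threshold model, the following Python sketch implements a McCulloch-Pitts-style neuron; the weights and threshold here are arbitrary illustrative values, not anything prescribed by the original model or this paper.

```python
import numpy as np

def mp_neuron(inputs, weights, threshold):
    """Fire (output 1) iff the weighted sum of the inputs reaches the threshold."""
    total = float(np.dot(inputs, weights))
    return 1 if total >= threshold else 0

# With weights (1, 1) and threshold 2, the neuron acts as a logical AND gate.
print(mp_neuron([1, 1], [1.0, 1.0], 2.0))  # -> 1
print(mp_neuron([1, 0], [1.0, 1.0], 2.0))  # -> 0
```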

Related work
Machine learning techniques such as deep neural networks have a wide range of applications, including speech recognition, image classification [6], computer vision tasks [7], machine translation, drug design, climate science, and cybersecurity ([8], [9]). DNNs are also very useful in automated driving: automotive researchers are able to detect stop signs, traffic lights, and pedestrians, which helps lower accident rates. The deeper layers of a DNN capture more of the variance in the data.
The main issues with deep neural networks are robustness, stability, and adversarial perturbations, which are discussed in ([10], [11], [12], [13]). One source of instability arises from the high depth of neural networks, where a "shattered gradient" effect has been observed [14]. Further advances in the applications of deep neural networks are presented in ([15], [16]). Solving differential equations using optimization strategies has been popular for quite some time in the form of least squares methods ([17], [18]) and Galerkin methods [19], and these techniques are also used in neural networks. Nowadays researchers use ANNs to solve differential equations; ANN solutions are in closed analytic form and are differentiable. Solving a differential equation with a neural network was first described by Lagaris et al. [20], with further early contributions in [21]. Asady and Nazarlue (2014) [30] solved the Fredholm integral equation using an ANN with great precision. Berg and Nyström (2018) [31] solved PDEs in complex geometries using deep feedforward artificial neural networks. Viana (2020) [32] introduced recurrent neural networks to integrate ordinary differential equations. For more related works, see Rackauckas [33] and Wang Huan [34].

Neural network architecture and working
We use the vectorized algorithm given in [35] to solve the system of Emden-Fowler equations using deep neural networks (DNNs). As indicated in Figure 1, the network has an input layer containing one neuron, the independent variable of the system, and n neurons in the output layer. We form a matrix $T = [\zeta_1, \ldots, \zeta_r] \in \mathbb{R}^{1\times r}$ using r sample points from the domain and take $N \in \mathbb{R}^{n\times r}$ as the output matrix; $N_\lambda(\zeta_\tau, Q_\lambda)$ is the λth output corresponding to the τth point, where $Q_\lambda$ collects the associated weight and bias parameters. For each ζ ∈ [a, b], a trial solution, which is an extension of the trial solution given in [35], is considered:
$$\hat{\omega}_j(\zeta) = \omega_j(a) + \omega_j'(a)\,(\zeta - a) + (\zeta - a)^2\, N_j(\zeta, Q_j), \qquad j = 1, 2, \ldots, n. \qquad (4)$$
This trial solution satisfies the initial conditions by construction. The network is trained so that the total cost function
$$J(Q) = \frac{1}{r}\sum_{\tau=1}^{r}\sum_{j=1}^{n}\left[\hat{\omega}_j''(\zeta_\tau) + \frac{\vartheta_j}{\zeta_\tau}\,\hat{\omega}_j'(\zeta_\tau) + \varphi_j\big(\zeta_\tau, \hat{\omega}_1(\zeta_\tau), \ldots, \hat{\omega}_n(\zeta_\tau)\big)\right]^2 \qquad (5)$$
converges to 0.
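To make (4) and (5) concrete, here is a minimal Python sketch under stated assumptions: the networks N_j are stand-in one-parameter functions, the right-hand sides phi_j are illustrative placeholders rather than the paper's systems, and derivatives are approximated by central differences for brevity (the paper itself uses automatic differentiation).

```python
import numpy as np

def make_trial(a, w0, dw0, N):
    """Trial solution w(z) = w(a) + w'(a)(z - a) + (z - a)^2 N(z),
    which satisfies the initial conditions by construction."""
    return lambda z: w0 + dw0 * (z - a) + (z - a) ** 2 * N(z)

def cost(zs, trials, thetas, phis, h=1e-4):
    """Mean squared residual of w_j'' + (theta_j / z) w_j' + phi_j = 0 on the grid zs."""
    w = [t(zs) for t in trials]                           # current solution values
    J = 0.0
    for t, theta, phi in zip(trials, thetas, phis):
        d1 = (t(zs + h) - t(zs - h)) / (2 * h)            # w'  (central difference)
        d2 = (t(zs + h) - 2 * t(zs) + t(zs - h)) / h**2   # w'' (central difference)
        J += np.mean((d2 + theta / zs * d1 + phi(zs, *w)) ** 2)
    return J

# Tiny usage example: one parameter c per unknown, N_j(z) = c * tanh(z).
zs = np.linspace(0.1, 2.0, 11)                            # grid avoiding the singular z = 0
trials = [make_trial(0.0, 1.0, 0.0, lambda z, c=c: c * np.tanh(z)) for c in (0.1, -0.2)]
phis = [lambda z, w1, w2: w1 * w2, lambda z, w1, w2: w1 + w2]   # placeholder nonlinearities
print(cost(zs, trials, (2.0, 2.0), phis))
```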

Forward propagation
We denote a node of layer m − 1 by s and a node of layer m by j. The value that goes into the jth node of the mth layer is
$$z_j^m = \sum_{s=1}^{d} y_{js}^m\, a_s^{m-1} + b_j^m, \qquad (6)$$
where d represents the number of neurons feeding into the layer. Equation (6) can be expressed in matrix form as
$$Z^m = Y^m A^{m-1} + b^m,$$
where $Y^m$ contains all the weights and $b^m$ is the bias. An appropriate activation function passes the values to the next hidden layer; for consistency, we select the identical activation function for all nodes within a layer. The values for the subsequent layer are
$$A^m = \pi^m(Z^m),$$
where $\pi^m$ is the activation function. For the r grid points $\zeta_\tau$, $\tau = 1, 2, \ldots, r$, we proceed with the following steps: in the initial hidden layer,
$$A^1 = \pi^1(Y^1 T + b^1),$$
at the second hidden layer,
$$A^2 = \pi^2(Y^2 A^1 + b^2),$$
and so on, until the output layer gives
$$N = Y^M A^{M-1} + b^M.$$
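A minimal vectorized forward pass matching these equations might look as follows in NumPy; the layer sizes and the Tanh activation are illustrative choices:

```python
import numpy as np

def forward(T, weights, biases):
    """Vectorized forward propagation: T has shape (1, r); returns N of shape (n, r).
    Hidden layers compute A^m = tanh(Y^m A^{m-1} + b^m); the output layer is linear."""
    A = T
    for Y, b in zip(weights[:-1], biases[:-1]):
        A = np.tanh(Y @ A + b)
    return weights[-1] @ A + biases[-1]

# Example: 1 input, hidden layers of 13 and 27 neurons, n = 2 outputs, r = 11 points.
rng = np.random.default_rng(0)
sizes = [1, 13, 27, 2]
weights = [rng.standard_normal((sizes[i + 1], sizes[i])) for i in range(len(sizes) - 1)]
biases = [rng.standard_normal((sizes[i + 1], 1)) for i in range(len(sizes) - 1)]
T = np.linspace(0.0, 2.0, 11).reshape(1, -1)
N = forward(T, weights, biases)   # shape (2, 11)
```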

The algorithm
Step 1: Take r distinct points and form a vector $T = [\zeta_1, \ldots, \zeta_r]$.
Step 2: Define the structure of the neural network: the number of layers M, an input layer with one unit, M − 2 hidden layers with $g_m$ units each, and an output layer with n units, where n corresponds to the number of unknowns in the system.
Step 3: Initialize the parameters, i.e., the weights and biases $Q_j$, j = 1, 2, ..., n, one set for each unknown function.
Step 4: Use forward propagation,
$$A^m = \pi^m(Y^m A^{m-1} + b^m), \qquad m = 1, \ldots, M-1,$$
where $\pi^m$ is the activation function for the mth hidden layer.
• For the output layer, $N = Y^M A^{M-1} + b^M$.
• We use the trial solution of order four in (4).
We need to initialize n sets of parameters, one for finding each unknown function.
Step 5: Use (5) to calculate the cost, the gradients, and the learning parameters, applying automatic differentiation [36] (a sketch of this step follows the algorithm).
Step 6: Utilize gradient descent or any other optimization method to update the parameters. With plain gradient descent, the update rule is
$$Q_{r+1} = Q_r - \varepsilon\, \nabla J(Q_r),$$
where ε is the learning rate and r denotes the iteration. There are many advanced methods we can choose instead of gradient descent, such as the momentum method, a modification of the gradient method, whose update rule is
$$L_{r+1} = \gamma L_r + \varepsilon\, \nabla J(Q_r), \qquad Q_{r+1} = Q_r - L_{r+1},$$
where γ is a coefficient between 0 and 1 that we call the momentum, and $L_j$ is a velocity parameter starting from 0 corresponding to each unknown. Another method is the Nesterov accelerated gradient, obtained from the momentum method with the update rule
$$L_{r+1} = \gamma L_r + \varepsilon\, \nabla J(Q_r - \gamma L_r), \qquad Q_{r+1} = Q_r - L_{r+1}.$$
The adaptive gradient method is
$$Q_{r+1} = Q_r - \frac{\varepsilon}{\sqrt{G_r + c}}\, \nabla J(Q_r), \qquad G_r = \sum_{i=1}^{r} \nabla J(Q_i) \odot \nabla J(Q_i),$$
where c is a minimal number to avoid division by 0; clearly, the learning rate decays because of the accumulated squared gradients. The update rule for root mean square propagation is
$$E_r = \rho E_{r-1} + (1 - \rho)\, \nabla J(Q_r) \odot \nabla J(Q_r), \qquad Q_{r+1} = Q_r - \frac{\varepsilon}{\sqrt{E_r + c}}\, \nabla J(Q_r),$$
where ρ ∈ (0, 1) is the forgetting factor. Another method is the Adam adaptive method, a combination of the last two methods, with the update rule
$$F_r = \rho_1 F_{r-1} + (1 - \rho_1)\, \nabla J(Q_r), \qquad L_r = \rho_2 L_{r-1} + (1 - \rho_2)\, \nabla J(Q_r) \odot \nabla J(Q_r),$$
$$\hat{F}_r = \frac{F_r}{1 - \rho_1^{\,r}}, \qquad \hat{L}_r = \frac{L_r}{1 - \rho_2^{\,r}}, \qquad Q_{r+1} = Q_r - \frac{\varepsilon}{\sqrt{\hat{L}_r} + c}\, \hat{F}_r,$$
where ρ1 and ρ2, both belonging to the interval [0, 1), represent the rates of decay for the moment estimates. Initially, we set the parameters $F_r$ and $L_r$ to 0. (Code sketches of two of these update rules also follow the algorithm.)
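For Step 5, the following sketch shows how the exact derivatives entering the cost can be obtained with automatic differentiation; we use JAX purely for illustration (the paper cites [36] but does not fix a library), and the one-parameter trial solution and the right-hand side are hypothetical.

```python
import jax
import jax.numpy as jnp

def trial(c, z):
    """Hypothetical one-parameter trial solution 1 + z^2 * c * tanh(z)."""
    return 1.0 + z**2 * c * jnp.tanh(z)

d1 = jax.grad(trial, argnums=1)   # dw/dz, exact via autodiff
d2 = jax.grad(d1, argnums=1)      # d2w/dz2

def cost(c, zs):
    """Mean squared residual of w'' + (2/z) w' + w^3 = 0 (illustrative equation)."""
    res = jax.vmap(lambda z: d2(c, z) + 2.0 / z * d1(c, z) + trial(c, z) ** 3)(zs)
    return jnp.mean(res**2)

zs = jnp.linspace(0.1, 2.0, 11)   # grid avoiding the singular point z = 0
g = jax.grad(cost)(0.1, zs)       # exact gradient of the cost w.r.t. the parameter c
```

For Step 6, here is a NumPy sketch of two of the update rules above (momentum and Adam); the hyperparameter values are common defaults, not the paper's settings.

```python
import numpy as np

def momentum_step(Q, L, grad, eps=0.01, gamma=0.9):
    """L_{r+1} = gamma * L_r + eps * grad;  Q_{r+1} = Q_r - L_{r+1}."""
    L = gamma * L + eps * grad
    return Q - L, L

def adam_step(Q, F, L, grad, r, eps=0.001, rho1=0.9, rho2=0.999, c=1e-8):
    """Adam with bias-corrected first (F) and second (L) moment estimates;
    r is the 1-based iteration count, and F, L start from 0."""
    F = rho1 * F + (1 - rho1) * grad
    L = rho2 * L + (1 - rho2) * grad**2
    F_hat = F / (1 - rho1**r)
    L_hat = L / (1 - rho2**r)
    return Q - eps * F_hat / (np.sqrt(L_hat) + c), F, L
```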

Numerical illustrations
This section presents the implementation of the algorithm designed to solve known systems of second-order nonlinear differential equations. The initial step involves experimenting to determine the appropriate number of layers and the number of neurons within each layer. The algorithm's accuracy is then validated by comparing the analytic solutions and the numerical solutions obtained using conventional methods. To achieve this objective, we simulate the following problems involving systems of second-order nonlinear differential equations.

Example 1. Let us consider a problem of Emden-Fowler equations documented in [37], with the initial conditions (ICs).

Investigating the network via experimentation for solving (9)

We performed an experiment to determine how many neurons to use in a layer. We examined how the number of hidden layers, the number of neurons, and the number of iterations affected the cost function. We have taken the trial solution up to n = 4. During the simulation, we varied the number of hidden neurons, specifically using sizes h = 4, 13, 27, 60, 90, and 180. We then compared the convergence of the cost function across these different sizes by plotting it against the number of iterations. In addition, we recorded the corresponding cost values and computation times for each neuron size at the end of the iterations. All other parameters remained constant throughout the experiment.
The findings illustrated in Figures 2(a)-(c) demonstrate that the necessary level of accuracy can be achieved with only one neuron in the hidden layer. However, many iterations are required when working with a smaller number of neurons, which can pose computational challenges. Increasing the number of neurons can improve the model's performance, but it is not always necessary. For example, when comparing a hidden layer with h = 60 neurons to one with h = 180 neurons, both achieve similar accuracy, but the former requires less computational time.
In our next experiment, we investigated the impact of the number of hidden layers on the model's performance. Previous research by Saeed et al. (2023) [38] demonstrated that, when solving nonlinear singular fractional differential equations, increasing the number of hidden layers results in improved performance in terms of error. Similarly, Panghal and Kumar (2021) [39] observed improved accuracy when simulating delay and first-order differential equation systems with multiple hidden layers. However, Dufera (2021) [35] found in his experiments that using more than one hidden layer did not lead to better performance when solving a system of first-order differential equations. We conducted numerous experiments to evaluate the performance of neural networks with one hidden layer against those with two hidden layers. The experiments involved varying the number of neurons in the hidden layers; all other settings, including the activation function (Tanh), remained the same in all experiments. The results, depicted in Figure 2(d), show that for the system of 2nd-order differential equations (9), adding more hidden layers with an appropriate number of neurons leads to improved performance.

Numerical solutions for solving (9)
We opted for two hidden layers comprising 13 and 27 neurons, respectively (tuning these counts within a small plus-or-minus range was not expected to have a significant impact). We set m = 11 for this experiment and sampled uniform grid points within the given interval. The ANN and analytic solutions are presented in Figure 3; Table 1 contains the numerical values, and Table 2 shows the error resulting from our approach.
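For completeness, the following self-contained JAX sketch ties the pieces together in the spirit of this experiment: the hidden layers of 13 and 27 neurons and the m = 11 uniform grid follow the text, while the coupled system, its ICs, and the right-hand sides are placeholders (system (9) itself is not restated in this excerpt), and plain gradient descent stands in for the optimizer.

```python
import jax
import jax.numpy as jnp

sizes = [1, 13, 27, 2]                       # 1 input, two hidden layers, n = 2 outputs
key = jax.random.PRNGKey(0)
params = []
for i in range(len(sizes) - 1):
    key, k = jax.random.split(key)
    params.append((0.1 * jax.random.normal(k, (sizes[i + 1], sizes[i])),
                   jnp.zeros((sizes[i + 1], 1))))

def net(params, z):
    """Forward pass for a scalar z; returns the two network outputs."""
    A = jnp.reshape(z, (1, 1))
    for Y, b in params[:-1]:
        A = jnp.tanh(Y @ A + b)
    Y, b = params[-1]
    return (Y @ A + b)[:, 0]

def trial(params, z):
    """Enforces the illustrative ICs w_j(0) = 1, w_j'(0) = 0 for both unknowns."""
    return 1.0 + z**2 * net(params, z)

d1 = jax.jacfwd(trial, argnums=1)            # first derivatives w.r.t. z
d2 = jax.jacfwd(d1, argnums=1)               # second derivatives w.r.t. z

def cost(params, zs):
    def res(z):
        w = trial(params, z)
        phi = jnp.stack([w[0] * w[1], w[0] + w[1]])   # placeholder nonlinearities
        return d2(params, z) + 2.0 / z * d1(params, z) + phi
    return jnp.mean(jax.vmap(res)(zs) ** 2)

zs = jnp.linspace(0.1, 2.0, 11)              # m = 11 points, avoiding z = 0

@jax.jit
def step(params, zs):
    grads = jax.grad(cost)(params, zs)
    return jax.tree_util.tree_map(lambda q, g: q - 1e-3 * g, params, grads)

for _ in range(2000):                        # training loop (iteration count illustrative)
    params = step(params, zs)
```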

Advantages of using ANN over a large number of data points for solving (9)
We use uniform grids of size m = 6, 11, 21, 55, and 100 over the domains [0, 2] and [0, 3] to compute solutions of the ODE system with the ANN method. As shown by the mean absolute error (MAE) in Table 3, one major advantage of ANN techniques is that they maintain solution accuracy whether the grid is coarse or fine.
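The MAE reported in Table 3 is simply the mean of the pointwise absolute differences between the ANN and analytic solutions; a short sketch with hypothetical values:

```python
import numpy as np

def mae(ann, exact):
    """Mean absolute error between ANN and analytic solutions on a grid."""
    return float(np.mean(np.abs(np.asarray(ann) - np.asarray(exact))))

# Hypothetical check on an m = 11 uniform grid over [0, 2]:
zs = np.linspace(0.0, 2.0, 11)
print(mae(np.exp(-zs) + 1e-4, np.exp(-zs)))  # -> 1.0e-04
```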
Similarly, experiments have been organized for the problems (10)-(13). Using this architecture, we compare the numerical and exact solutions in Tables 4-11 for each example. We also compare the numerical and exact solutions graphically, together with their error plots, in Figure 4 and Figure 5.
Example 3. Let us consider a problem of Emden-Fowler equations documented in [40], with ICs.
Example 4. Let us consider a problem of Emden-Fowler equations documented in [41], with ICs.
Example 5. Let us consider a problem of Emden-Fowler equations documented in [42], with ICs.

Table 4. ANN and analytical solutions for system (10).
Columns: Grid point | ANN ω1 | Analytic ω1 | ANN ω2 | Analytic ω2 | HAM ω1 [37] | ADM ω1 [37] | HAM ω2 [37]

Conclusion

The results show stability between the target and predicted values, which validates the model. Through various experiments with Python code and accompanying graphical simulations, we gained insight into the nature of the model architecture. Specifically, we found that for specific problems even a single neuron in the hidden layer can achieve the necessary accuracy, whereas larger numbers of neurons provide greater precision at the cost of more parameter-learning iterations. However, we caution against arbitrary increases in neuron count and recommend selecting an optimal size based on the underlying problem.
This study suggests the possibility of developing neural software that automatically adjusts the number of hidden layers and neurons based on the problem. Future research should focus on conducting additional analytical investigations to enhance the theoretical underpinnings of DNNs for solving systems of ODEs, encompassing areas such as delay differential equations and fractional differential equations. Such investigations would involve assessing the consistency, convergence, and suitability of DNNs in the context of solving systems of ODEs.

Figure 2. Comparison of loss functions for ANN models with one vs. two hidden layers.

Figure 4. Comparison of ANN and exact solutions of examples with error plots.

Figure 5. Comparison of ANN and exact solutions of examples with error plots.

Table 2. Computing the absolute difference between ANN and exact solutions for system (9).

Table 3. MAE for different grid points over the intervals.

Table 6. Computing the absolute difference between ANN and exact solutions for system (11).

Table 7. Computing the absolute difference between ANN and exact solutions for system (12).

Table 11. Computing the absolute difference between ANN and exact solutions for system (13).