Introduction

Autoregressive exogenous (ARX) models are used to describe the dynamic relationship between inputs and outputs in discrete systems1,2,3. These models can effectively capture the characteristics of time-series data and predict their future values. Consequently, ARX models are widely used in various fields, such as precision machining4, heat conduction5, composite materials6, artificial intelligence7, the petrochemical industry8, and weather prediction9.

Parameter identification is a crucial technique for estimating the unknown parameters of a model from observed input and output data. Models constructed using identification technology achieve high accuracy, and the topic has thus garnered widespread attention10,11. Various researchers have explored parameter identification methods for ARX models. For example, Dong et al.12 developed a weighted hierarchical stochastic gradient (SG) descent algorithm to improve the parameter convergence accuracy. Li et al.13 studied a decoupled identification scheme based on a neural fuzzy network and an ARX model. Tu et al.14 proposed a conjugate gradient descent method that accelerated convergence. Jing et al.15 established an ARX model using a variable step-size SG descent algorithm. A parameter learning scheme using multiple signals was proposed16, which improves the model identification accuracy. Liang et al.17 extended the Nesterov accelerated gradient descent algorithm into a multi-innovation form and used the multi-innovation matrix to accurately identify ARX model parameters. Li et al.18 derived a multi-innovation extended SG descent method to improve the parameter estimation accuracy. Ding et al.19 applied the multi-innovation stochastic gradient (MISG) descent algorithm to the identification of nonlinear ARX systems. Chen et al.20 used Kalman filtering to estimate the ARX model output and combined it with the expectation maximisation algorithm to estimate parameters. Additionally, a Shannon principle-based forgetting factor gradient descent algorithm was proposed21, which improves the parameter convergence speed. Li et al.22 used a correlation analysis method and an adaptive Kalman filter to estimate the parameters of systems with measurement noise. Chen et al.23 developed an improved multi-step gradient iteration algorithm based on the Kalman filter to identify ARX models with missing output data, resulting in enhanced parameter identification accuracy. A separation identification algorithm combined with filtering technology was proposed24 to identify the parameters of systems with colored noise. Li et al.25 combined correlation analysis theory and data filtering technology to estimate the parameters of multi-input multi-output systems. Stojanovic et al.26 estimated model parameters using the Masreliez–Martin filter. Other researchers27 decomposed the ARX model into two subsystems and established a two-stage iteration algorithm. Wang et al.28 developed a three-stage subsystem generalised least squares identification algorithm. The SG algorithm was improved by incorporating a convergence index29, resulting in enhanced parameter estimation accuracy. Additionally, Chen et al.30 studied a particle-filter-based SG identification algorithm. Li et al.31 developed a long short-term memory network identification algorithm in which, by combining the advantages of the SG descent and root mean square propagation algorithms, an adaptive momentum estimation technique is created to optimize the network parameters.

Notably, among the abovementioned methods for ARX model parameter identification, the gradient descent algorithm has been widely used owing to its broad applicability and ease of implementation in engineering scenarios. However, traditional integer-order SG descent algorithms rely solely on the direction and magnitude of the integer-order gradient, resulting in low parameter identification speed and accuracy. Compared with the integer-order gradient, the order of a fractional-order gradient can be selected freely, which removes the restriction of integer-order differentiation. The fractional-order gradient therefore offers greater flexibility, which can improve the convergence speed and convergence accuracy of the algorithm32,33. In recent years, many scholars have focused on the fractional-order gradient. For example, Chen et al.34 ensured that the fractional gradient could converge to the minimum point by changing the initial integration point. Wei et al.35 derived three forms of fractional-order gradients and proved their convergence. Other researchers36 established a fractional-order stochastic gradient (FOSG) descent algorithm and an adaptive FOSG descent algorithm. Additionally, the hierarchical principle was used to design a fractional hierarchical gradient algorithm37.

The FOSG algorithm directly extends the integer-order gradient to a fractional order. However, a single fractional-order gradient depends strongly on the choice of fractional order, and an improper order selection might reduce the identification accuracy and speed of the algorithm. Moreover, the fractional gradient typically uses only the information at the current moment to identify the model parameters at the next moment, leaving the available identification information underutilised. To address these limitations, this paper proposes a multi-innovation additional fractional gradient descent identification algorithm. The main contributions of this study can be summarized as follows:

  • The proposed algorithm uses an additional fractional-order gradient and the integer-order gradient synchronously to identify model parameters, thereby accelerating the convergence speed of the algorithm.

  • The multi-innovation principle is introduced to expand the integer-order gradient and additional fractional-order gradient into multi-innovation matrices.

  • The proposed method is compared with SG, FOSG and MISG to verify its superiority in convergence speed and convergence accuracy.

  • The proposed algorithm can avoid the inaccurate parameters estimation result caused by improper fractional order selection.

The remainder of this paper is organised as follows: The mathematical model of an ARX system is presented in Section “Autoregressive exogenous models”. Section “Additional fractional gradient descent identification algorithm” outlines the additional fractional gradient descent identification algorithm. Section “Multi-innovation additional fractional gradient descent identification algorithm and convergence analysis” describes the multi-innovation additional fractional gradient identification algorithm and analysis of its convergence. Section “Simulation and experiment” presents details of the simulation and experiment for evaluating the algorithm performance. Section “Conclusion” presents the concluding remarks.

Autoregressive exogenous models

The following ARX model is considered

$$ A(z^{ - 1} )y(t) = B(z^{ - 1} )u(t) + v(t) $$
(1)

where \(u\left( t \right)\) and \(y\left( t \right)\) are the system input and output, respectively. \(v\left( t \right)\) denotes white noise with finite variance \(\sigma_{v}^{2}\). The polynomials \(A(z^{ - 1} )\) and \(B(z^{ - 1} )\) can be expanded as

$$ A(z^{ - 1} ) = 1 + a_{1} z^{ - 1} + a_{2} z^{ - 2} + \cdots + a_{n} z^{ - n} $$
(2)
$$ B(z^{ - 1} ) = b_{1} + b_{2} z^{ - 1} + \cdots + b_{n} z^{ - n + 1} $$
(3)

where n is a known model order, and \(z^{ - 1}\) denotes the backward shift (delay) operator.

The information vector \(\varphi (t)\) is defined using Eqs. (2) and (3). Combined with \(z^{ - n} y(t) = y(t - n)\) and \(z^{ - n} u(t) = u(t - n)\), \(\varphi (t)\) can be expressed as

$$ \varphi (t) = [ - y\left( {t - 1} \right), - y\left( {t - 2} \right), \cdots , - y\left( {t - n} \right),u\left( t \right),u\left( {t - 1} \right), \cdots ,u\left( {t - n + 1} \right)]^{{\text{T}}} $$
(4)

Next, the parameter vector \(\theta\) is defined, and parameters \(a_{1} ,a_{2} , \cdots ,a_{n} ,b_{1} ,b_{2} , \cdots ,b_{n}\) in Eqs. (2) and (3) are presented in the vector form

$$ \theta = \left[ {a_{1} ,a_{2} , \cdots ,a_{n} ,b_{1} ,b_{2} , \cdots ,b_{n} } \right]^{{\text{T}}} $$
(5)

Using Eqs. (2)–(5), Eq. (1) can be rewritten as

$$ y(t) = \varphi^{{\text{T}}} (t)\theta + v(t) $$
(6)
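For concreteness, the mapping from recorded input–output data to the information vector of Eq. (4) and the regression form of Eq. (6) can be sketched as follows (a minimal sketch assuming NumPy; the helper name `phi` is ours, not the paper's):

```python
import numpy as np

def phi(y, u, t, n):
    """Information vector of Eq. (4): n negated past outputs followed by
    the current and n-1 past inputs; y and u are 1-D arrays, t >= n."""
    past_y = [-y[t - i] for i in range(1, n + 1)]   # -y(t-1), ..., -y(t-n)
    past_u = [u[t - i] for i in range(0, n)]        # u(t), ..., u(t-n+1)
    return np.array(past_y + past_u)

# Regression form of Eq. (6): y[t] = phi(y, u, t, n) @ theta + v[t],
# with theta = [a_1, ..., a_n, b_1, ..., b_n].
```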

Additional fractional gradient descent identification algorithm

The additional fractional gradient descent identification algorithm adds a fractional-order gradient to the integer-order gradient and exploits the flexibility of the fractional order to improve the convergence speed of the algorithm. The gradient descent algorithm is an iterative method: the unknown model parameters are identified by minimising a criterion function step by step along the direction of gradient descent. The directions of the integer-order and fractional-order gradients correspond to the partial and fractional derivatives of the function, respectively.

The fractional derivative is an extension of the integer-order derivative, and its order can be any real or complex number. Unlike the integer-order derivative, the fractional derivative has three common definitions (Grünwald–Letnikov, Riemann–Liouville, and Caputo). In this study, the Caputo definition is used, expressed as38

$$ {}_{{t_{0} }}^{C} D_{{t_{f} }}^{\alpha } f(t) = \frac{1}{\Gamma (m - \alpha )}\int_{{t_{0} }}^{{t_{f} }} {\frac{{f^{(m)} (\tau )}}{{(t_{f} - \tau )^{\alpha - m + 1} }}} \,d\tau $$
(7)

where \(_{{t_{0} }}^{C} D_{{t_{f} }}^{\alpha }\) is the fractional calculus operator; \(\alpha\) is the order of the operator, with \(m - 1 < \alpha < m,m \in {\mathbb{Z}}_{ + }\); \(t_{0}\) and \(t_{f}\) are the lower and upper integration limits, respectively; and \(\Gamma (\cdot)\) denotes the gamma function.

The Taylor series expansion of the Caputo derivative is

$$ {}_{{t_{0} }}^{C} D_{{t_{f} }}^{\alpha } f(t) = \sum\limits_{i = m}^{ + \infty } {\binom{\alpha - m}{i - m}} \frac{{f^{(i)} (t_{0} )}}{{\Gamma (i + 1 - \alpha )}}(t_{f} - t_{0} )^{i - \alpha } $$
(8)

where \(\binom{p}{q} = \frac{\Gamma (p + 1)}{\Gamma (q + 1)\Gamma (p - q + 1)}\), \(p \in {\mathbb{R}}\), \(q \in {\mathbb{N}}\).

The fractional-order gradient \(\nabla^{\alpha } f(x)\) can be expressed as

$$ \nabla^{\alpha } f(x) = \mu \sum\limits_{i = 1}^{ + \infty } {\binom{\alpha - 1}{i - 1}} \frac{{f^{(i)} (x)}}{{\Gamma (i + 1 - \alpha )}}(t_{f} - t_{0} )^{i - \alpha } $$
(9)

where \(0 < \alpha < 1\), and \(\mu\) is the step-size factor. To avoid complex numbers or a zero denominator, Eq. (9) can be rewritten as

$$ \nabla^{\alpha } f(x) = \mu \frac{{f^{(1)} (x)}}{\Gamma (2 - \alpha )}(\left| {t_{f} - t_{0} } \right| + \varepsilon )^{1 - \alpha } $$
(10)

where \(\varepsilon\) is a small non-negative number. Equation (10) can be extended to the case of \(1 < \alpha < 2\)39.
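To make Eq. (10) concrete, the following minimal sketch (our own helper, with \(\mu = 1\) and \(f(x) = x^{2}\)) evaluates the practical fractional gradient; at \(\alpha = 1\) it reduces to the ordinary gradient, since \(\Gamma (1) = 1\) and the power factor becomes 1:

```python
import math

def frac_grad_1d(x, x_prev, alpha, eps=1e-8):
    """Eq. (10) for f(x) = x^2, i.e. f'(x) = 2x, with t_0 = x_prev, t_f = x."""
    return 2.0 * x * (abs(x - x_prev) + eps) ** (1.0 - alpha) / math.gamma(2.0 - alpha)

for alpha in (0.5, 1.0, 1.5):
    print(alpha, frac_grad_1d(1.0, 0.8, alpha))  # alpha = 1.0 returns exactly 2x = 2.0
```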

Then, the criterion function can be expressed as

$$ J(\theta ) = \frac{1}{2}\left[ {y(t) - \varphi^{{\text{T}}} (t)\theta } \right]^{2} $$
(11)

The additional fractional gradient descent algorithm, which incorporates both the fractional order and integer order gradients to identify \(\theta\), can be expressed as

$$ \hat{\theta }(t) = \hat{\theta }(t - 1) - \gamma \nabla J(\hat{\theta }(t - 1)) - \nabla^{\alpha } J(\hat{\theta }(t - 1)) $$
(12)

where \(\hat{\theta }\) is the estimate of parameter vector \(\theta\), and \(\gamma\) denotes the step-size factor of the integer-order gradient. Substituting Eq. (11) into Eq. (12) yields

$$ \begin{aligned} \hat{\theta }(t) &= \hat{\theta }(t - 1) - \gamma \nabla J(\hat{\theta }(t - 1)) - \nabla^{\alpha } J(\hat{\theta }(t - 1)) \\ &= \hat{\theta }(t - 1) + \gamma \varphi (t)[y(t) - \varphi^{{\text{T}}} (t)\hat{\theta }(t - 1)] - \mu \frac{{\nabla J(\hat{\theta }(t - 1))}}{{\Gamma (2 - \alpha )}}\left( {\left| {\hat{\theta }(t - 1) - \hat{\theta }(t - 2)} \right| + \varepsilon } \right)^{1 - \alpha } \\ &= \hat{\theta }(t - 1) + \gamma \varphi (t)[y(t) - \varphi^{{\text{T}}} (t)\hat{\theta }(t - 1)] + \mu \frac{{\varphi (t)[y(t) - \varphi^{{\text{T}}} (t)\hat{\theta }(t - 1)]}}{{\Gamma (2 - \alpha )}}\left( {\left| {\hat{\theta }(t - 1) - \hat{\theta }(t - 2)} \right| + \varepsilon } \right)^{1 - \alpha } \\ &= \hat{\theta }(t - 1) + \gamma \varphi (t)[y(t) - \varphi^{{\text{T}}} (t)\hat{\theta }(t - 1)] + \mu \frac{{\Xi (\hat{\theta },\alpha ,t)\varphi (t)[y(t) - \varphi^{{\text{T}}} (t)\hat{\theta }(t - 1)]}}{{\Gamma (2 - \alpha )}} \end{aligned} $$
(13)

where \(\mu \frac{{\Xi (\hat{\theta },\alpha ,t)\varphi (t)[y(t) - \varphi^{{\text{T}}} (t)\hat{\theta }(t - 1)]}}{{\Gamma (2 - \alpha )}}\) is the additional fractional gradient, \(\Xi (\hat{\theta },\alpha ,t) = {\text{diag}}\left\{ {\left[ {\left| {\hat{\theta }_{1} (t - 1) - \hat{\theta }_{1} (t - 2)} \right| + \varepsilon } \right]^{1 - \alpha } ,\left[ {\left| {\hat{\theta }_{2} (t - 1) - \hat{\theta }_{2} (t - 2)} \right| + \varepsilon } \right]^{1 - \alpha } , \cdots ,\left[ {\left| {\hat{\theta }_{l} (t - 1) - \hat{\theta }_{l} (t - 2)} \right| + \varepsilon } \right]^{1 - \alpha } } \right\}\), and l is the number of parameters to be identified. The step-size factors \(\gamma\) and \(\mu\) are chosen as

$$ \gamma = 1/\overline{r}(t), \qquad \overline{r}(t) = \overline{r}(t - 1) + \left\| {\varphi (t)} \right\|^{2} $$
(14)
$$ \mu = 1/r(t), \qquad r(t) = \overline{r}(t - 1) + \left\| {\Xi (\hat{\theta },\alpha ,t)\varphi (t)} \right\|^{2} , \qquad \overline{r}(0) = 1 $$
(15)

The output error in Eq. (13) is defined as

$$ e(t) = y(t) - \varphi^{{\text{T}}} (t)\hat{\theta }(t - 1) $$
(16)

Combining Eqs. (13)-(16), we can obtain the iterative formula of the additional fractional gradient descent algorithm, as follows

$$ \hat{\theta }(t) = \hat{\theta }(t - 1) + \frac{{\varphi (t)e(t)}}{{\overline{r}(t)}} + \frac{{\Xi (\hat{\theta },\alpha ,t)\varphi (t)e(t)}}{{r(t)\Gamma (2 - \alpha )}} $$
(17)

Equation (17) shows that the two gradients identify the model parameters jointly, which prevents the algorithm from failing to converge owing to an improper order selection in a single fractional gradient. Compared with a single integer-order gradient, the additional fractional gradient can drive the parameters to the true values faster, even when the fractional order is small.
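As an illustration, one iteration of Eqs. (14)–(17) might look as follows (a sketch under our own naming, assuming NumPy; `theta1` and `theta2` denote \(\hat{\theta }(t - 1)\) and \(\hat{\theta }(t - 2)\)):

```python
import math
import numpy as np

def afg_step(theta1, theta2, phi_t, y_t, r_bar, alpha, eps=1e-8):
    """One update of the additional fractional gradient descent, Eq. (17)."""
    e = y_t - phi_t @ theta1                           # output error, Eq. (16)
    Xi = np.diag((np.abs(theta1 - theta2) + eps) ** (1.0 - alpha))
    r_bar_new = r_bar + phi_t @ phi_t                  # Eq. (14)
    r_new = r_bar + np.linalg.norm(Xi @ phi_t) ** 2    # Eq. (15), uses r_bar(t-1)
    theta_new = (theta1
                 + phi_t * e / r_bar_new                                # integer-order term
                 + Xi @ phi_t * e / (r_new * math.gamma(2.0 - alpha)))  # additional fractional term
    return theta_new, r_bar_new
```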

Multi-innovation additional fractional gradient descent identification algorithm and convergence analysis

Multi-innovation additional fractional gradient descent identification algorithm

The additional fractional gradient descent identification algorithm leverages the flexibility of fractional calculus to identify the unknown parameters through the integer-order and fractional-order gradients simultaneously, which improves the convergence accuracy and speed. However, both gradients use only the data at the current moment, leaving the available information underutilised and limiting the improvement in the parameter estimates at each iteration. Therefore, to further enhance the speed and accuracy of parameter identification, the multi-innovation principle is introduced, and a multi-innovation additional fractional gradient descent algorithm is established.

The multi-innovation principle combines current and past data into a multi-innovation matrix, which is then used to estimate the current parameters40,41. In this context, \(y(t)\), \(\varphi^{{\text{T}}} (t)\), \(e(t)\), and \(v(t)\) are termed single innovations and are extended to the multi-innovation matrices \(Y(p,t)\), \(\Phi (p,t)\), \(E(p,t)\), and \(V(p,t)\), respectively.

$$ Y(p,t) = \left[ {y(t),y(t - 1), \cdots ,y(t - p + 1)} \right]^{{\text{T}}} $$
(18)
$$ \Phi (p,t) = \left[ {\varphi (t),\varphi (t - 1), \cdots ,\varphi (t - p + 1)} \right] $$
(19)
$$ V(p,t) = \left[ {v(t),v(t - 1), \cdots ,v(t - p + 1)} \right]^{{\text{T}}} $$
(20)
$$ E(p,t) = \left[ {e(t),e(t - 1), \cdots ,e(t - p + 1)} \right]^{{\text{T}}} $$
(21)

where p denotes the multi-innovation length. Combining Eqs. (16), (18), and (19), Eq. (21) can be rewritten as

$$ \begin{aligned} E(p,t) &= \left[ {\begin{array}{*{20}c} {y(t) - \varphi^{{\text{T}}} (t)\hat{\theta }(t - 1)} \\ {y(t - 1) - \varphi^{{\text{T}}} (t - 1)\hat{\theta }(t - 1)} \\ \vdots \\ {y(t - p + 1) - \varphi^{{\text{T}}} (t - p + 1)\hat{\theta }(t - 1)} \\ \end{array} } \right] \\ &= Y(p,t) - \Phi^{{\text{T}}} (p,t)\hat{\theta }(t - 1) \end{aligned} $$
(22)
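In code, stacking the p most recent samples (reusing the hypothetical `phi` helper from the sketch after Eq. (6)) might look like:

```python
import numpy as np

def innovation_stack(y, u, t, n, p, theta_hat):
    """Y(p,t), Phi(p,t) of Eqs. (18)-(19) and E(p,t) of Eq. (22).
    Phi holds one information vector per column, so it is l x p."""
    Y = np.array([y[t - j] for j in range(p)])                       # Eq. (18)
    Phi = np.column_stack([phi(y, u, t - j, n) for j in range(p)])   # Eq. (19)
    E = Y - Phi.T @ theta_hat                                        # Eq. (22)
    return Y, Phi, E
```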

The multi-innovation additional fractional gradient descent algorithm criterion function is defined as

$$ J_{1} (\theta ) = \frac{1}{2}\left\| {Y(p,t) - \Phi^{{\text{T}}} (p,t)\theta } \right\|^{2} $$
(23)

Replacing the single innovations in Eq. (17) with the multi-innovation matrices of Eqs. (18), (19), and (21), the multi-innovation additional fractional gradient descent algorithm can be formulated as

$$ \hat{\theta }(t) = \hat{\theta }(t - 1) + \frac{{\Phi (p,t)E(p,t)}}{{\overline{r}(t)}} + \frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)E(p,t)}}{{r(t)\Gamma (2 - \alpha )}} $$
(24)
$$ \overline{r}(t) = \overline{r}(t - 1) + \left\| {\Phi (p,t)} \right\|^{2} $$
(25)
$$ r(t) = \overline{r}(t - 1) + \left\| {\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)} \right\|^{2} $$
(26)

Compared with Eq. (17), Eq. (24) uses more observation data in the parameter identification process and has a higher data utilization rate.
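A single iteration of Eqs. (24)–(26) then generalises the earlier `afg_step` sketch by swapping the single innovations for the stacked matrices (again our own naming, not the authors' code):

```python
import math
import numpy as np

def miafg_step(theta1, theta2, Phi, Y, r_bar, alpha, eps=1e-8):
    """One update of Eqs. (24)-(26); Phi is l x p, Y is a p-vector."""
    E = Y - Phi.T @ theta1                             # innovation vector, Eq. (22)
    Xi = np.diag((np.abs(theta1 - theta2) + eps) ** (1.0 - alpha))
    r_bar_new = r_bar + np.linalg.norm(Phi) ** 2       # Eq. (25), Frobenius norm
    r_new = r_bar + np.linalg.norm(Xi @ Phi) ** 2      # Eq. (26)
    theta_new = (theta1
                 + Phi @ E / r_bar_new
                 + Xi @ Phi @ E / (r_new * math.gamma(2.0 - alpha)))
    return theta_new, r_bar_new
```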

Convergence analysis

Convergence analysis is crucial for a gradient descent algorithm because it determines the reliability and stability of the algorithm and is a prerequisite for both simulation and practical application. For clarity, the following lemmas are introduced.

Lemma 1

Reference42 For the ARX model in Eq. (6) and the multi-innovation additional fractional gradient descent algorithm presented in Eqs. (24)–(26), suppose there exist constants \(\overline{\alpha }\) and \(\beta\) with \(0 < \overline{\alpha } \le \beta < \infty\) and an integer \(N \ge n\) such that the following strong persistent excitation condition holds.

$$ \overline{\alpha }I_{n} \le \frac{1}{N}\sum\limits_{i = 0}^{N - 1} {\varphi (t + i)} \varphi^{{\text{T}}} (t + i) \le \beta I_{n} $$
(27)

Then \(\overline{r}(t)\) in Eq. (25) satisfies the inequality

$$ n\overline{\alpha }(t - N + 1) \le \overline{r}(t) \le n\beta (t + N - 1) + 1 $$
(28)
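Numerically, the excitation condition of Eq. (27) can be checked by inspecting the eigenvalues of the averaged outer-product matrix over a window of information vectors; a small sketch (the helper name is ours):

```python
import numpy as np

def pe_bounds(phis):
    """Smallest/largest eigenvalue of (1/N) * sum_i phi_i phi_i^T,
    i.e. empirical estimates of alpha_bar and beta in Eq. (27)."""
    N = len(phis)
    M = sum(np.outer(v, v) for v in phis) / N
    eig = np.linalg.eigvalsh(M)      # ascending order
    return eig[0], eig[-1]
```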

Lemma 2

Reference43 The following inequality holds

$$ 2xy \le ax^{2} + y^{2} /a $$
(29)

where a is a positive real number; in the proof below, a is taken in the interval (0, 1].

Lemma 3

Reference43 Suppose the non-negative sequences \(\left\{ {\kappa (t)} \right\}\), \(\left\{ {\psi_{t} } \right\}\), and \(\left\{ {\beta_{t} } \right\}\) satisfy \(\kappa (t + 1) \le (1 - \psi_{t} )\kappa (t) + \beta_{t}\), with \(\psi_{t} \in [0,1)\), \(\sum\limits_{t = 1}^{\infty } {\psi_{t} } = \infty\), and \(\kappa (0) < \infty\). Then

$$ \mathop {\lim \sup }\limits_{t \to \infty } \kappa (t) \le \mathop {\lim }\limits_{t \to \infty } \frac{{\beta_{t} }}{{\psi_{t} }} $$
(30)

Proof

Let the parameter estimation error be defined as

$$ \tilde{\theta }(t) = \hat{\theta }(t) - \theta $$
(31)

According to Eqs. (6) and (18)–(20), we obtain

$$ Y(p,t) = \Phi^{{\text{T}}} (p,t)\theta + V(p,t) $$
(32)

Subtracting \(\theta\) from both sides of Eq. (24) and combining Eqs. (22) and (32), we obtain the iterative equation for the parameter estimation error \(\tilde{\theta }(t)\).

$$ \begin{aligned} \tilde{\theta }(t) &= \tilde{\theta }(t - 1) + \frac{{\Phi (p,t)E(p,t)}}{{\overline{r}(t)}} + \frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)E(p,t)}}{{r(t)\Gamma (2 - \alpha )}} \\ &= \tilde{\theta }(t - 1) + \frac{{\Phi (p,t)}}{{\overline{r}(t)}}[Y(p,t) - \Phi^{{\text{T}}} (p,t)\hat{\theta }(t - 1)] + \frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)}}{{r(t)\Gamma (2 - \alpha )}}[Y(p,t) - \Phi^{{\text{T}}} (p,t)\hat{\theta }(t - 1)] \\ &= \tilde{\theta }(t - 1) + \frac{{\Phi (p,t)}}{{\overline{r}(t)}}[ - \Phi^{{\text{T}}} (p,t)\tilde{\theta }(t - 1) + V(p,t)] + \frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)}}{{r(t)\Gamma (2 - \alpha )}}[ - \Phi^{{\text{T}}} (p,t)\tilde{\theta }(t - 1) + V(p,t)] \\ &= \left[ {I_{n} - \frac{{\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{\overline{r}(t)}} - \frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{r(t)\Gamma (2 - \alpha )}}} \right]\tilde{\theta }(t - 1) + \frac{{\Phi (p,t)V(p,t)}}{{\overline{r}(t)}} + \frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)V(p,t)}}{{r(t)\Gamma (2 - \alpha )}} \end{aligned} $$
(33)

Taking the squared norm of both sides of Eq. (33), the following expression is obtained.

$$ \begin{aligned} \left\| {\tilde{\theta }(t)} \right\|^{2} &= \left\| {\left[ {I_{n} - \frac{{\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{\overline{r}(t)}} - \frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{r(t)\Gamma (2 - \alpha )}}} \right]\tilde{\theta }(t - 1)} \right\|^{2} \\ &\quad + 2\tilde{\theta }^{{\text{T}}} (t - 1)\left[ {I_{n} - \frac{{\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{\overline{r}(t)}} - \frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{r(t)\Gamma (2 - \alpha )}}} \right]\frac{{\Phi (p,t)V(p,t)}}{{\overline{r}(t)}} \\ &\quad + 2\tilde{\theta }^{{\text{T}}} (t - 1)\left[ {I_{n} - \frac{{\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{\overline{r}(t)}} - \frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{r(t)\Gamma (2 - \alpha )}}} \right]\frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)V(p,t)}}{{r(t)\Gamma (2 - \alpha )}} \\ &\quad + 2\,\frac{{[\Phi (p,t)V(p,t)]^{{\text{T}}} }}{{\overline{r}(t)}}\frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)V(p,t)}}{{r(t)\Gamma (2 - \alpha )}} + \left\| {\frac{{\Phi (p,t)V(p,t)}}{{\overline{r}(t)}}} \right\|^{2} + \left\| {\frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)V(p,t)}}{{r(t)\Gamma (2 - \alpha )}}} \right\|^{2} \\ &\le \lambda_{\max } \left[ {I_{n} - \frac{{\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{\overline{r}(t)}} - \frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{r(t)\Gamma (2 - \alpha )}}} \right]\left\| {\tilde{\theta }(t - 1)} \right\|^{2} \\ &\quad + 2\tilde{\theta }^{{\text{T}}} (t - 1)\left[ {I_{n} - \frac{{\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{\overline{r}(t)}} - \frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{r(t)\Gamma (2 - \alpha )}}} \right]\frac{{\Phi (p,t)V(p,t)}}{{\overline{r}(t)}} \\ &\quad + 2\tilde{\theta }^{{\text{T}}} (t - 1)\left[ {I_{n} - \frac{{\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{\overline{r}(t)}} - \frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{r(t)\Gamma (2 - \alpha )}}} \right]\frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)V(p,t)}}{{r(t)\Gamma (2 - \alpha )}} \\ &\quad + 2\,\frac{{[\Phi (p,t)V(p,t)]^{{\text{T}}} }}{{\overline{r}(t)}}\frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)V(p,t)}}{{r(t)\Gamma (2 - \alpha )}} + \left\| {\frac{{\Phi (p,t)V(p,t)}}{{\overline{r}(t)}}} \right\|^{2} + \left\| {\frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)V(p,t)}}{{r(t)\Gamma (2 - \alpha )}}} \right\|^{2} \end{aligned} $$
(34)

According to the discussion in Ref.44 and the definition of \(\Xi (\hat{\theta },\alpha ,t)\) in Eq. (13), \(0 < \left| {\hat{\theta }_{i} (t - 1) - \hat{\theta }_{i} (t - 2)} \right| < 1\), \(i = 1,2, \cdots ,l\) holds. Then

$$ \left\{ \begin{array}{ll} \varepsilon^{{\frac{1 - \alpha }{2}}} < \Xi^{\frac{1}{2}} (\hat{\theta },\alpha ,t) \le (1 + \varepsilon )^{{\frac{1 - \alpha }{2}}} , & 0 < \alpha \le 1 \\ (1 + \varepsilon )^{{\frac{1 - \alpha }{2}}} < \Xi^{\frac{1}{2}} (\hat{\theta },\alpha ,t) \le \varepsilon^{{\frac{1 - \alpha }{2}}} , & 1 < \alpha \le 2 \end{array} \right. $$
(35)

Moreover, \(\left\| {\Xi (\hat{\theta },\alpha ,t)} \right\| \le \max \left\{ {\varepsilon^{1 - \alpha } ,\left( {1 + \varepsilon } \right)^{1 - \alpha } } \right\}\) for \(0 < \alpha < 2\). According to Lemma 1,

$$ \begin{aligned} \frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{r(t)\Gamma (2 - \alpha )}} &= \frac{{\Xi (\hat{\theta },\alpha ,t)}}{{r(t)\Gamma (2 - \alpha )}}\sum\limits_{i = 0}^{N - 1} {\varphi (t - i)\varphi^{{\text{T}}} (t - i)} \\ &\ge \frac{{\varepsilon^{1 - \alpha } N\overline{\alpha }}}{{r(t)\Gamma (2 - \alpha )}}I_{n} \end{aligned} $$
(36)

where \(\overline{\alpha } > 0\), \(N > 0\), and \(\Gamma (2 - \alpha ) > 0\) within the range \(0 < \alpha < 2\). Given that \(r(t) = \overline{r}(t - 1) + \left\| {\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)} \right\|^{2}\), it can be found that \(0 \le \overline{r}(m) - \overline{r}(0) \le r(m + 1)\) and \(0 \le \left\| {\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)} \right\|^{2} \le r(m + 1)\) for \(m = 1,2, \cdots ,t - 1\); hence \(r(t) > 0\). Thus, the matrix in Eq. (36) is positive definite, and the following inequality used in Eq. (34) is obtained.

$$ I_{n} - \frac{{\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{\overline{r}(t)}} - \frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{r(t)\Gamma (2 - \alpha )}} \le I_{n} - \frac{{\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{\overline{r}(t)}} $$
(37)

According to Lemma 1, with p = N, we obtain

$$ \begin{aligned} I_{n} - \frac{{\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{\overline{r}(t)}} &= I_{n} - \frac{1}{{\overline{r}(t)}}\sum\limits_{i = 0}^{N - 1} {\varphi (t - i)\varphi^{{\text{T}}} (t - i)} \\ &\le \left[ {1 - \frac{{N\overline{\alpha }}}{{\overline{r}(t)}}} \right]I_{n} \\ &\le \left[ {1 - \frac{{N\overline{\alpha }}}{{n\beta (t - N + 1) + 1}}} \right]I_{n} \end{aligned} $$
(38)

Combining Eqs. (36)–(38) and Lemma 1 and taking the expectation of Eq. (34), we obtain

$$ \begin{aligned} {\text{E}}\left[ {\left\| {\tilde{\theta }(t)} \right\|^{2} } \right] &\le \left[ {1 - \frac{{N\overline{\alpha }}}{{n\beta (t - N + 1) + 1}}} \right]{\text{E}}\left[ {\left\| {\tilde{\theta }(t - 1)} \right\|^{2} } \right] \\ &\quad + 2{\text{E}}\left\{ {\tilde{\theta }^{{\text{T}}} (t - 1)\left[ {I_{n} - \frac{{\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{\overline{r}(t)}}} \right]\frac{{\Phi (p,t)V(p,t)}}{{\overline{r}(t)}}} \right\} \\ &\quad + 2{\text{E}}\left\{ {\tilde{\theta }^{{\text{T}}} (t - 1)\left[ {I_{n} - \frac{{\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{\overline{r}(t)}}} \right]\frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)V(p,t)}}{{r(t)\Gamma (2 - \alpha )}}} \right\} \\ &\quad + 2{\text{E}}\left\{ {\frac{{[\Phi (p,t)V(p,t)]^{{\text{T}}} }}{{\overline{r}(t)}}\frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)V(p,t)}}{{r(t)\Gamma (2 - \alpha )}}} \right\} + \frac{{{\text{E}}\left[ {\left\| {\Phi (p,t)V(p,t)} \right\|^{2} } \right]}}{{\overline{r}^{2} (t)}} + \frac{{{\text{E}}\left[ {\left\| {\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)V(p,t)} \right\|^{2} } \right]}}{{[r(t)\Gamma (2 - \alpha )]^{2} }} \\ &= Q_{1} + Q_{2} + Q_{3} + Q_{4} + Q_{5} \end{aligned} $$
(39)

where

$$ Q_{1} = \left[ {1 - \frac{{N\overline{\alpha }}}{{n\beta (t - N + 1) + 1}}} \right]{\text{E}}\left[ {\left\| {\tilde{\theta }(t - 1)} \right\|^{2} } \right] + \frac{{{\text{E}}\left[ {\left\| {\Phi (p,t)V(p,t)} \right\|^{2} } \right]}}{{\overline{r}^{2} (t)}} $$
$$ Q_{2} = 2{\text{E}}\left\{ {\tilde{\theta }^{{\text{T}}} (t - 1)\left[ {I_{n} - \frac{{\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{\overline{r}(t)}}} \right]\frac{{\Phi (p,t)V(p,t)}}{{\overline{r}(t)}}} \right\} $$
$$ Q_{3} = 2{\text{E}}\left\{ {\tilde{\theta }^{{\text{T}}} (t - 1)\left[ {I_{n} - \frac{{\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{\overline{r}(t)}}} \right]\frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)V(p,t)}}{{r(t)\Gamma (2 - \alpha )}}} \right\} $$

\(Q_{4} = 2{\text{E}}\left\{ {\frac{{[\Phi (p,t)V(p,t)]^{{\text{T}}} }}{{\overline{r}(t)}}\frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)V(p,t)}}{{r(t)\Gamma (2 - \alpha )}}} \right\}\), and \(Q_{5} = \frac{{{\text{E}}\left[ {\left\| {\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)V(p,t)} \right\|^{2} } \right]}}{{[r(t)\Gamma (2 - \alpha )]^{2} }}\).

According to Lemma 1 and Eq. (35), the following inequalities are obtained

$$ \begin{aligned} {\text{E}}\left[ {\left\| {\Phi (p,t)V(p,t)} \right\|^{2} } \right] &\le {\text{E}}\left\{ {\lambda_{\max } \left[ {\Phi (p,t)\Phi^{{\text{T}}} (p,t)} \right]\left\| {V(p,t)} \right\|^{2} } \right\} \\ &\le p\beta \,{\text{E}}\left[ {\left\| {V(p,t)} \right\|^{2} } \right] = p^{2} \beta \sigma_{v}^{2} = N^{2} \beta \sigma_{v}^{2} \end{aligned} $$
(40)
$$ \begin{aligned} {\text{E}}\left[ {\left\| {\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)V(p,t)} \right\|^{2} } \right] &\le {\text{E}}\left\{ {\lambda_{\max } \left[ {\Phi (p,t)\Phi^{{\text{T}}} (p,t)} \right]\left\| {\Xi (\hat{\theta },\alpha ,t)} \right\|^{2} \left\| {V(p,t)} \right\|^{2} } \right\} \\ &\le p\beta \,{\text{E}}\left[ {\left\| {\Xi (\hat{\theta },\alpha ,t)} \right\|^{2} \left\| {V(p,t)} \right\|^{2} } \right] = p^{2} \beta (1 + \varepsilon )^{2(1 - \alpha )} \sigma_{v}^{2} = N^{2} \beta (1 + \varepsilon )^{2(1 - \alpha )} \sigma_{v}^{2} \end{aligned} $$
(41)

Substituting Eqs. (40) and (28) into \(Q_{1}\) yields

$$ Q_{1} = \left[ {1 - \frac{{N\overline{\alpha }}}{n\beta (t - N + 1) + 1}} \right]{\text{E}}\left[ {\left\| {\tilde{\theta }(t - 1)} \right\|^{2} } \right] + \frac{{N^{2} \beta \sigma_{v}^{2} }}{{[n\overline{\alpha }(t - N + 1) + 1]^{2} }} $$
(42)

We take the limit of \(Q_{1}\) according to Lemma 3

$$ \begin{aligned} \mathop {\lim }\limits_{t \to \infty } Q_{1} &\le \mathop {\lim }\limits_{t \to \infty } \frac{{N^{2} \beta \sigma_{v}^{2} }}{{[n\overline{\alpha }(t - N + 1) + 1]^{2} }}\,\frac{{n\beta (t - N + 1) + 1}}{{N\overline{\alpha }}} \\ &\le \mathop {\lim }\limits_{t \to \infty } \frac{{N\beta \sigma_{v}^{2} [n\beta (t - N + 1) + 1]}}{{\overline{\alpha }[n\overline{\alpha }(t - N + 1) + 1]^{2} }} \sim \frac{{N\beta^{2} \sigma_{v}^{2} }}{{n\overline{\alpha }^{3} }}\frac{1}{t} \end{aligned} $$
(43)

When \(t \to \infty\), \(Q_{1} \to 0\). Now, we use Lemma 2 and Eqs. (38) and (40) to prove the convergence of \(Q_{2}\).

$$ \begin{aligned} Q_{2} &= 2{\text{E}}\left\{ {\tilde{\theta }^{{\text{T}}} (t - 1)\left[ {I_{n} - \frac{{\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{\overline{r}(t)}}} \right]\frac{{\Phi (p,t)V(p,t)}}{{\overline{r}(t)}}} \right\} \\ &\le a\left[ {1 - \frac{{N\overline{\alpha }}}{{n\beta (t - N + 1) + 1}}} \right]{\text{E}}\left[ {\left\| {\tilde{\theta }(t - 1)} \right\|^{2} } \right] + \frac{1}{a}{\text{E}}\left[ {\left\| {\frac{{\Phi (p,t)V(p,t)}}{{\overline{r}(t)}}} \right\|^{2} } \right] \\ &\le \left[ {1 - \frac{{N\overline{\alpha }}}{{n\beta (t - N + 1) + 1}}} \right]{\text{E}}\left[ {\left\| {\tilde{\theta }(t - 1)} \right\|^{2} } \right] + \frac{1}{{a^{2} }}{\text{E}}\left[ {\left\| {\frac{{\Phi (p,t)V(p,t)}}{{\overline{r}(t)}}} \right\|^{2} } \right] \\ &\le \left[ {1 - \frac{{N\overline{\alpha }}}{{n\beta (t - N + 1) + 1}}} \right]{\text{E}}\left[ {\left\| {\tilde{\theta }(t - 1)} \right\|^{2} } \right] + \frac{{N^{2} \beta \sigma_{v}^{2} }}{{a^{2} [n\overline{\alpha }(t - N + 1) + 1]^{2} }} \end{aligned} $$
(44)

Similarly, according to Lemma 3, Eq. (44) can be rewritten as

$$ \begin{aligned} \mathop {\lim }\limits_{t \to \infty } Q_{2} &\le \mathop {\lim }\limits_{t \to \infty } \frac{{N^{2} \beta \sigma_{v}^{2} }}{{a^{2} [n\overline{\alpha }(t - N + 1) + 1]^{2} }}\,\frac{{n\beta (t - N + 1) + 1}}{{N\overline{\alpha }}} \\ &\le \mathop {\lim }\limits_{t \to \infty } \frac{{N\beta \sigma_{v}^{2} [n\beta (t - N + 1) + 1]}}{{\overline{\alpha }a^{2} [n\overline{\alpha }(t - N + 1) + 1]^{2} }} \sim \frac{{N\beta^{2} \sigma_{v}^{2} }}{{na^{2} \overline{\alpha }^{3} }}\frac{1}{t} \end{aligned} $$
(45)

When \(t \to \infty\), \(Q_{2} \to 0\). Because \(r(t) = \overline{r}(t - 1) + \left\| {\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)} \right\|^{2}\), it follows that \(r(t) > \overline{r}(t - 1)\). We combine Lemma 2 and Eqs. (40) and (41) to prove the convergence of \(Q_{3}\).

$$ \begin{aligned} Q_{3} &= 2{\text{E}}\left\{ {\tilde{\theta }^{{\text{T}}} (t - 1)\left[ {I_{n} - \frac{{\Phi (p,t)\Phi^{{\text{T}}} (p,t)}}{{\overline{r}(t)}}} \right]\frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)V(p,t)}}{{r(t)\Gamma (2 - \alpha )}}} \right\} \\ &\le \left[ {1 - \frac{{N\overline{\alpha }}}{{n\beta (t - N + 1) + 1}}} \right]{\text{E}}\left[ {\left\| {\tilde{\theta }(t - 1)} \right\|^{2} } \right] + \frac{{(1 + \varepsilon )^{2(1 - \alpha )} N^{2} \beta \sigma_{v}^{2} }}{{a^{2} [n\overline{\alpha }(t - N) + 1]^{2} \Gamma^{2} (2 - \alpha )}} \end{aligned} $$
(46)

According to Lemma 3, Eq. (46) can be rewritten as

$$ \begin{aligned} \mathop {\lim }\limits_{t \to \infty } Q_{3} &\le \mathop {\lim }\limits_{t \to \infty } \frac{{(1 + \varepsilon )^{2(1 - \alpha )} N^{2} \beta \sigma_{v}^{2} }}{{a^{2} [n\overline{\alpha }(t - N) + 1]^{2} \Gamma^{2} (2 - \alpha )}}\,\frac{{n\beta (t - N + 1) + 1}}{{N\overline{\alpha }}} \\ &\le \mathop {\lim }\limits_{t \to \infty } \frac{{(1 + \varepsilon )^{2(1 - \alpha )} N\beta \sigma_{v}^{2} [n\beta (t - N + 1) + 1]}}{{\overline{\alpha }a^{2} [n\overline{\alpha }(t - N) + 1]^{2} \Gamma^{2} (2 - \alpha )}} \sim \frac{{(1 + \varepsilon )^{2(1 - \alpha )} N\beta^{2} \sigma_{v}^{2} }}{{n\overline{\alpha }^{3} a^{2} \Gamma^{2} (2 - \alpha )}}\frac{1}{t} \end{aligned} $$
(47)

When \(t \to \infty\), \(Q_{3} \to 0\). Similarly, we combine Lemmas 1 and 3 as well as Eqs. (40)-(41) to prove the convergence of Q4.

$$ \begin{aligned} Q_{4} &= 2{\text{E}}\left\{ {\frac{{[\Phi (p,t)V(p,t)]^{{\text{T}}} }}{{\overline{r}(t)}}\frac{{\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)V(p,t)}}{{r(t)\Gamma (2 - \alpha )}}} \right\} \\ &\le \frac{{2{\text{E}}\left[ {V^{{\text{T}}} (p,t)\Phi^{{\text{T}}} (p,t)\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)V(p,t)} \right]}}{{\overline{r}(t)\overline{r}(t - 1)\Gamma (2 - \alpha )}} \\ &\le \frac{{2(1 + \varepsilon )^{2(1 - \alpha )} N\beta \sigma_{v}^{2} }}{{[n\overline{\alpha }(t - N + 1) + 1][n\overline{\alpha }(t - N) + 1]\Gamma (2 - \alpha )}} \sim \frac{{2(1 + \varepsilon )^{2(1 - \alpha )} N\beta \sigma_{v}^{2} }}{{n^{2} \overline{\alpha }^{2} \Gamma (2 - \alpha )}}\frac{1}{{t^{2} }} \end{aligned} $$
(48)

When \(t \to \infty\), \(Q_{4} \to 0\). Lastly, we combine Lemmas 1 and 3 with Eq. (40) to prove the convergence of \(Q_{5}\).

$$ \begin{aligned} \mathop {\lim }\limits_{t \to \infty } Q_{5} & = \frac{{{\text{E}}\left[ {\left\| {\Xi (\hat{\theta },\alpha ,t)\Phi (p,t)V(p,t)} \right\|^{2} } \right]}}{{[r(t)\Gamma (2 - \alpha )]^{2} }} \\ & \le \frac{{(1 + \varepsilon )^{2(1 - \alpha )} N^{2} \beta \sigma_{v}^{2} }}{{[n\beta (t - N) + 1]^{2} \Gamma^{2} (2 - \alpha )}} \\ & \sim \frac{{(1 + \varepsilon )^{2(1 - \alpha )} \sigma_{v}^{2} }}{{n^{2} \beta \Gamma^{2} (2 - \alpha )}}\frac{1}{{t^{2} }} \end{aligned} $$
(49)

Based on this analysis, together with Eqs. (39), (43), (45), and (47)–(49), we conclude that the expectation of the parameter identification error \({\text{E}}\left[ {\left\| {\tilde{\theta }} \right\|^{2} } \right]\) converges to 0, thereby proving the convergence of the algorithm.

The steps of the multi-innovation additional fractional gradient descent identification algorithm are summarised below.

Algorithm

Multi-innovation additional fractional gradient descent identification algorithm

Step 1

Obtain the model input data u(t) and output data y(t). Specify the unknown ARX model parameter vector \(\theta\) to be identified

Step 2

Determine the multi-innovation length p. Construct the multi-innovation matrices \(Y(p,t)\) and \(\Phi (p,t)\) according to Eqs. (18) and (19)

Step 3

Determine the number of iterations and the fractional gradient order \(\alpha\)

Step 4

Use the integer-order and fractional-order gradients in Eq. (24) to estimate the model parameters \(\hat{\theta }\) synchronously

Step 5

Use the identification results to update the parameter estimate \(\hat{\theta }(t)\) in Eq. (24)

Step 6

Repeat Steps 4–5. When the set number of iterations is reached, output the final identification result (see the sketch below)
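To make these steps concrete, the following Python sketch applies them to the simulation model (51) of the next section. It is a minimal sketch under stated assumptions, not the authors' implementation: the regressor layout, the diagonal \(\Xi (\hat{\theta },\alpha ,t)\) with entries \(|\hat{\theta }_{i} |^{1 - \alpha }\), and the recursive normalisation r(t) are inferred from the convergence analysis above, since Eq. (24) itself appears earlier in the paper.

```python
import math
import numpy as np

def miafg(u, y, p=3, alpha=1.2, eps=1e-8):
    """Sketch of the multi-innovation additional fractional gradient
    descent update (Steps 1-6) for the model structure of Eq. (51).
    The forms of Xi and r(t) are assumptions, not the paper's Eq. (24)."""
    n = 4                                    # theta = [a1, a2, b1, b2]
    theta = np.full(n, 1e-3)                 # small nonzero initial estimate
    r = 1.0                                  # normalisation term r(t)
    for t in range(p + 1, len(y)):
        # Step 2: stack the last p regressors and outputs into
        # Phi(p,t) (n x p) and Y(p,t) (p x 1)
        Phi = np.column_stack(
            [np.array([y[k - 1], y[k - 2], u[k], u[k - 1]])
             for k in range(t, t - p, -1)])
        Y = np.array([y[k] for k in range(t, t - p, -1)])
        r += float(Phi[:, 0] @ Phi[:, 0])    # accumulate ||phi(t)||^2
        E = Y - Phi.T @ theta                # multi-innovation error vector
        g = Phi @ E / r                      # integer-order (MISG-type) step
        # Step 4: additional fractional-order step, assuming a
        # Caputo-type diagonal Xi with entries |theta_i|^(1 - alpha)
        xi = (np.abs(theta) + eps) ** (1.0 - alpha)
        theta = theta + g + xi * g / math.gamma(2.0 - alpha)
    return theta
```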

Simulation and experiment

To verify the effectiveness of the algorithm, we use a simulation example and conduct a three degree-of-freedom (3-DOF) gyroscope model identification experiment. The proposed algorithm is compared with the SG, FOSG36, and MISG19 algorithms, and the contributions of the additional fractional order and the multi-innovation length to the algorithm performance are examined. The model identification accuracy is evaluated using the following index

$$ \delta = \frac{{\left\| {\hat{\theta }(t) - \theta } \right\|}}{\left\| \theta \right\|} $$
(50)
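This index is straightforward to compute; a minimal sketch (the function name is ours):

```python
import numpy as np

def delta(theta_hat, theta_true):
    # relative parameter estimation error of Eq. (50)
    return np.linalg.norm(theta_hat - theta_true) / np.linalg.norm(theta_true)
```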

Numerical simulation

Consider the following ARX model

$$ y(t) = a_{1} y(t - 1) + a_{2} y(t - 2) + b_{1} u(t) + b_{2} u(t - 1) + v(t) $$
(51)

The parameters to be identified are \(\theta = \left[ {a_{1} ,a_{2} ,b_{1} ,b_{2} } \right]^{{\text{T}}} = \left[ {1.5, - 0.65,1.25,0.85} \right]^{{\text{T}}}\), \(u(t)\) is a persistent excitation signal, and \(v(t)\) is white noise with variance \(\sigma_{v}^{2} = 0.6^{2}\). The proposed algorithm is compared with the SG, FOSG, and MISG methods. In the proposed algorithm and the FOSG method, the fractional order is set as \(\alpha = 1.2\); in the MISG method and the proposed algorithm, the multi-innovation length is set as p = 3. The identification results are shown in Figs. 1 and 2 and summarised in Table 1.
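A synthetic data set matching this setup can be generated as follows. The uniform excitation signal and the sample count are assumptions (the paper states only that \(u(t)\) is persistently exciting); the last two lines reuse the miafg and delta sketches given above.

```python
import numpy as np

rng = np.random.default_rng(0)               # fixed seed for reproducibility
T = 3000                                     # sample count (assumed)
a1, a2, b1, b2 = 1.5, -0.65, 1.25, 0.85      # true parameters of Eq. (51)
u = rng.uniform(-1.0, 1.0, T)                # assumed persistently exciting input
v = rng.normal(0.0, 0.6, T)                  # white noise, sigma_v = 0.6
y = np.zeros(T)
for t in range(2, T):
    y[t] = a1 * y[t - 1] + a2 * y[t - 2] + b1 * u[t] + b2 * u[t - 1] + v[t]

theta_hat = miafg(u, y, p=3, alpha=1.2)
print(delta(theta_hat, np.array([a1, a2, b1, b2])))
```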

Figure 1

Comparison of different identification methods. (a) iterative convergence error, (b) absolute error of each parameter identification.

Figure 2

Convergence speed and accuracy for each parameter.

Table 1 Comparison of identification results with different algorithms.

As indicated in Fig. 1 and Table 1, the proposed algorithm introduces an additional fractional gradient term that the existing approaches lack. Because the integer-order and fractional-order gradients identify the model parameters simultaneously, the algorithm achieves higher convergence speed and accuracy and a smaller parameter identification error. Figure 2 shows that the multi-innovation additional fractional-order gradient descent algorithm can promptly and accurately identify four different unknown parameters, highlighting its effectiveness.

Subsequently, we examine the influence of the multi-innovation length p on the algorithm performance. To this end, p is set as 1, 3, 5, and 7, with the fractional order fixed at \(\alpha = 1.2\). The identification results are shown in Figs. 3 and 4 and summarised in Table 2.

Figure 3

Comparison of estimation performance under different multi-innovation lengths. (a) iterative convergence error, (b) absolute error of each parameter identification.

Figure 4

Convergence speed and accuracy for each parameter.

Table 2 Comparison of identification results with different multi-innovation lengths.

Figures 3 and 4 and Table 2 show that as the multi-innovation length increases, the proposed algorithm exhibits faster convergence and higher parameter identification accuracy. This improvement is attributable to the additional fractional gradient descent algorithm synchronously extending the single innovations of the fractional and integer gradients into a multi-innovation matrix, thereby enhancing data utilisation. When the single innovation (p = 1) is extended to a multi-innovation matrix (p = 7), the evaluation index \(\delta\) decreases from 0.1550 to 0.0196 after 2000 iterations.

Next, we examine the influence of different fractional orders on the identification speed and accuracy of the proposed algorithm. To this end, we set p = 3 and the fractional order as \(\alpha = 0.5,0.8,1.2,1.5\). The identification results are shown in Figs. 5 and 6 and summarised in Table 3.

Figure 5

Comparison of estimation performance under different fractional orders. (a) iterative convergence error, (b) absolute error of each parameter identification.

Figure 6

Convergence speed and accuracy for each parameter.

Table 3 Comparison of identification results with different fractional orders.

It can be seen from Figs. 5 and 6 and Table 3 that as the fractional order increases, the parameter identification error gradually decreases. Moreover, owing to the presence of both the integer-order and fractional-order gradients, the proposed algorithm maintains superior convergence speed and accuracy compared with the integer-order gradient descent algorithm, even when the fractional order is small. For instance, when \(\alpha = 0.5\), the identification error (\(\delta = 0.0286\)) is still smaller than that of the SG algorithm (\(\delta = 0.2241\) in Table 1). This outcome further verifies the effectiveness of the proposed algorithm.

Experiment

The experimental analysis is conducted using a 3-DOF gyroscope system, which is shown in Fig. 7. The system consists of a gyroscope, a motor, a data acquisition card, and a power amplifier. The computer controls the input motor torque and transmits the signal to the data acquisition card through the Quarc 2020 real-time control software. The power amplifier amplifies the signal and applies it to the motor to precisely control the attitude angle of the gyroscope.

Figure 7

The 3-DOF gyroscope experimental device.

The 3-DOF gyroscope model can be expressed as follows

$$ y_{e} (t) = a_{1} y(t - 1) + a_{2} y(t - 2) + b_{1} u(t - 1) $$
(52)

The model error is evaluated using the following index

$$ {\text{Error}} = \sum\limits_{t = 0}^{{T_{f} }} {\left| {y_{{\text{e}}} (t) - y_{id} (t)} \right|} $$
(53)

where \(y_{id} (t)\) is the identified system output and \(T_{f}\) is the termination time of the system operation.
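Assuming the measured and identified outputs are sampled at the same instants, Eq. (53) reduces to a cumulative absolute deviation; a minimal sketch (function name ours):

```python
import numpy as np

def model_error(y_exp, y_id):
    # cumulative absolute deviation between the measured output y_e(t)
    # and the identified output y_id(t), as in Eq. (53)
    return float(np.sum(np.abs(np.asarray(y_exp) - np.asarray(y_id))))
```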

To determine the system input and output, the Quarc database module in MATLAB/Simulink is used to build the system input and output measurement units. The data collection step length is 0.004 s, and the collection period is 50 s. The proposed algorithm is compared with the SG, FOSG36, and MISG19 methods, with p = 5 and the fractional order set as \(\alpha = 1.2\). The experimental data and identification results are shown in Figs. 8, 9 and 10 and summarised in Tables 4 and 5.

Figure 8

The experimental data and identified system. (a) System input and output data, (b) system identified by the proposed algorithm.

Figure 9

Identification errors with different methods.

Figure 10

Identification results with different fractional orders.

Table 4 Identification results with different methods.
Table 5 Identification results with different fractional orders.

Figure 8 shows that the proposed algorithm can accurately identify the parameters of the 3-DOF gyroscope system. It can be seen from Fig. 9 and Table 4 that, compared with the SG, FOSG, and MISG algorithms, the proposed algorithm has the smallest identification error and can more accurately describe the dynamic characteristics of the gyroscope system. Figure 10 and Table 5 show that the identification results remain essentially consistent across different values of \(\alpha\), which further verifies that the proposed algorithm offers high flexibility in engineering applications.

Conclusion

This paper introduces an additional fractional gradient descent identification algorithm based on the multi-innovation principle for ARX systems. The proposed algorithm incorporates a fractional-order gradient along with the integer-order gradient and leverages the flexibility of fractional-order calculus to improve the convergence speed. Additionally, it extends single innovations to multi-innovation matrices, further enhancing data utilisation and parameter identification accuracy. The convergence of the algorithm is analysed, and its effectiveness is verified through a simulation and an experiment. The results show that the proposed approach outperforms the SG, FOSG, and MISG algorithms in terms of convergence speed and accuracy, and that the parameter identification accuracy and convergence speed improve with increasing multi-innovation length and fractional order. A limitation is that the fractional order \(\alpha\) of the current algorithm is fixed. Future work will consider adaptive fractional-order selection and extend the adaptive multi-innovation additional fractional gradient descent identification algorithm to nonlinear and time-delay systems.