Online learning fuzzy echo state network with applications on redundant manipulators

1 Introduction

To improve production efficiency and reduce reliance on manual labor, robots have emerged and undergone rapid, substantial progress, with plentiful and successful applications in numerous areas (Sun et al., 2023b; Liu et al., 2024). In particular, redundant manipulators, which possess more degrees of freedom (DOFs) than are required to fulfill a specific task, stand out and have been subject to in-depth and comprehensive investigation (Liao et al., 2016; Liu et al., 2023). More precisely, by virtue of the additional DOFs, they are capable of executing secondary tasks while performing the primary task, such as avoiding obstacles, optimizing joint torques, and enhancing operability (Jin et al., 2017a; Sun et al., 2022a). For that reason, research on the mechanisms and applications of redundant manipulators is in full swing. It is worth mentioning, however, that the additional DOFs also make it challenging to control manipulators efficiently and precisely (Zhang et al., 2019; Zhao et al., 2020). There is thus a pressing need to devise a potent control scheme for redundant manipulators (Jin et al., 2017b; Liao et al., 2022).

With a sophisticated nervous system, humans are capable of performing a variety of complicated missions by learning from experience, which is perhaps their most prominent advantage over other creatures (Wang et al., 2016; Liao et al., 2024b). This observation opens up a new avenue for manipulator control: a manipulator can accomplish an assigned task with high efficiency by emulating the learning ability of humans. Both the neural network (NN) (Su et al., 2023a; Wei and Jin, 2024) and the fuzzy inference system (FIS) (Vargas et al., 2024) attempt to simulate human thinking and decision-making in a certain way. They have therefore garnered the attention of researchers, and much effort has been devoted to integrating them into manipulator control systems to improve task completion and meet the requirements of different scenarios. For instance, Yoo and Ham (2000) present adaptive control schemes for manipulators in which parameter uncertainty is handled via the FIS. Afterward, aiming at tracking control of the end-effector of manipulators, an FIS-based controller is designed by Yilmaz et al. (2022), in which the centers and widths of the membership functions are adjusted adaptively, thus promoting the learning power of the controller. Recently, Yilmaz et al. (2023) devised an FIS-based output-feedback controller for joint-space tracking of manipulators, which eliminates the need for joint velocity measurements and knowledge of the manipulator model.

In recent times, a surge of research has emerged around the echo state network (ESN), a type of recurrent neural network (RNN) that circumvents certain problems hindering the investigation and application of RNNs, such as vanishing and exploding gradients (Rodan and Tino, 2011; Chen et al., 2023). The core of the ESN is the reservoir, a large, sparse network in charge of capturing the dynamic behavior of the input. In particular, both input and reservoir weights are generated at random, and training effort is devoted solely to the output weights, which form a weighted sum of the reservoir states (Lukoševičius, 2012). Another related network, the extreme learning machine (ELM) (Huang et al., 2006), is a feedforward network with a single hidden layer: the hidden-layer weights and biases are assigned randomly, and training focuses on determining the output weights through the least-squares method. From this perspective, the ELM, ESN, and FIS share a certain similarity, and thus a great deal of work has been carried out that builds and verifies the bridges between them (Sun et al., 2007; Ribeiro et al., 2020). By integrating these networks and leveraging their respective strengths, some remarkable work has been presented and utilized in various domains to address different issues. Concentrating on function approximation and classification problems, a fuzzy ELM with the capacity for online learning was devised by Rong et al. (2009); compared with other existing mechanisms, it attains decent accuracy with reduced training time. Motivated by this, aiming at efficient control of redundant manipulators, this study proposes an online learning fuzzy ESN (OLFESN).
To be more specific, the proposed OLFESN is built upon an online learning strategy for the ESN in order to establish an efficient control scheme for redundant manipulators, while the FIS is incorporated to improve the accuracy and efficiency of the network. A corresponding control scheme for redundant manipulators is then constructed. The rest of this study is organized as follows: Section 2 reviews some preliminaries that lay the foundation for this study. The OLFESN is then proposed, based on which the control scheme for redundant manipulators is devised in Section 3. In Section 4, simulations and experiments are carried out to investigate the feasibility and effectiveness of the proposed control scheme. In the end, Section 5 concludes this study.

2 Preliminaries

In this section, the forward kinematics of redundant manipulators, the Takagi–Sugeno–Kang (TSK) fuzzy system, and ESN are briefly reviewed, which are the bases of the proposed OLFESN.

2.1 Forward kinematics of redundant manipulators

The forward kinematics equation that depicts the non-linear transformation of redundant manipulators from the joint angle q ∈ ℝa to the Cartesian position r ∈ ℝb with a > b can be depicted as

r=ϒ(q),    (1)

where ϒ(•) signifies the non-linear mapping function, which depends upon the structural properties of redundant manipulators (Sun et al., 2022b; Zhang et al., 2022). Thereafter, differentiating Equation (1) with respect to time contributes to

ṙ=J(q)q̇,    (2)

in which J(q) = ∂ϒ(q)/∂q ∈ ℝb×a denotes the Jacobian matrix; q̇ denotes the joint angular velocity; ṙ denotes the velocity of the end-effector (Yan et al., 2024). In this way, the non-linear transformation (Equation 1) is converted into the affine system (Equation 2), which facilitates obtaining the redundancy solution of redundant manipulators (Sun et al., 2023a).
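To make Equations (1) and (2) concrete, the sketch below instantiates ϒ(•) and J(q) for a hypothetical planar a-link arm; the link lengths, function names, and the arm itself are illustrative assumptions, not the manipulator used later in this study:

```python
import numpy as np

def forward_kinematics(q, link_lengths):
    """Position r = Y(q) of a planar a-link arm's end-effector (Equation 1)."""
    phi = np.cumsum(q)  # absolute orientation of each link
    x = np.sum(link_lengths * np.cos(phi))
    y = np.sum(link_lengths * np.sin(phi))
    return np.array([x, y])

def jacobian(q, link_lengths):
    """Jacobian J(q) = dY/dq in R^{2 x a} for the planar chain above."""
    phi = np.cumsum(q)
    a = len(q)
    J = np.zeros((2, a))
    for j in range(a):
        # joint j rotates every link from j onward
        J[0, j] = -np.sum(link_lengths[j:] * np.sin(phi[j:]))
        J[1, j] = np.sum(link_lengths[j:] * np.cos(phi[j:]))
    return J
```

A finite-difference check Δr ≈ J(q)Δq mirrors the passage from the non-linear map (Equation 1) to the affine velocity-level system (Equation 2).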

2.2 Takagi–Sugeno–Kang fuzzy system

In the TSK fuzzy system with a given input α = [α1; α2; ⋯; αm] ∈ ℝm, the k-th rule can be depicted as (Kerk et al., 2021; Zhang et al., 2023):

Rule k: IF α1 is A1k, α2 is A2k, ⋯, αm is Amk, THEN χk = β0k + β1kα1 + β2kα2 + ⋯ + βmkαm,    (3)

where k = 1, 2, ⋯, k~ is the index of the fuzzy rule, with k~ being the number of fuzzy rules; Amk denotes the fuzzy subset of the m-th element of input α in the k-th rule; χk signifies the output of the k-th rule; βm~k (m~ = 0, 1, ⋯, m) is the consequent coefficient of the k-th rule. Considering the m-th element of the input in the k-th rule, the degree to which it matches the fuzzy subset Amk is measured by its membership function ζAmk(αm), which can be any bounded, non-constant, piecewise-continuous function (Rezaee and Zarandi, 2010). Let ⊗ denote the fuzzy conjunction operation; then the firing strength (IF part) of the k-th rule is defined as

Ok(α,pk)=ζA1k(α1,p1,k)⊗ζA2k(α2,p2,k)⊗⋯⊗ζAmk(αm,pm,k),    (4)

where pk collects the parameters of the membership functions ζ(•) in the k-th rule. Normalizing Equation (4) gives

Ψ(α,pk)=Ok(α,pk)/∑k=1k~Ok(α,pk).    (5)

Ultimately, for the input α, the output of the TSK fuzzy model can be obtained as

ỹ=∑k=1k~θkOk(α,pk)/∑k=1k~Ok(α,pk)=∑k=1k~θkΨ(α,pk),    (6)

with θk = (θk1, θk2, ⋯ , θkm).
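As an illustration of Equations (3)–(6), the sketch below runs one TSK inference pass; the Gaussian membership functions, the product used as the conjunction operator ⊗, and all variable names are assumptions of this example:

```python
import numpy as np

def tsk_output(alpha, centers, widths, beta):
    """First-order TSK inference, Equations (3)-(6).
    centers, widths: (k, m) Gaussian membership parameters p_k;
    beta: (k, m+1) consequent coefficients [beta0, beta1, ..., betam]."""
    # Gaussian memberships, with the product as conjunction operator
    mu = np.exp(-((alpha - centers) ** 2) / (2 * widths ** 2))  # (k, m)
    O = np.prod(mu, axis=1)                # firing strengths, Equation (4)
    Psi = O / np.sum(O)                    # normalized strengths, Equation (5)
    chi = beta[:, 0] + beta[:, 1:] @ alpha # rule consequents, Equation (3)
    return np.sum(Psi * chi)               # weighted output, Equation (6)
```

For example, an input at the shared centers of two rules fires both equally, so the output is the mean of the two consequents.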

2.3 Echo state network

The ESN is composed of an input layer, a reservoir, and an output layer, which contain l, r, and o neurons, respectively (Calandra et al., 2021). In the complete network, the input layer is connected to the reservoir by the input weights Win∈ℝr×l, and the reservoir is connected to the output layer by the output weights Wout∈ℝo×r, while the internal neurons of the reservoir are interconnected by dint of Wres∈ℝr×r (Chen et al., 2024a). In particular, the spectral radius of Wres needs to be less than 1 for the network to possess the echo state property. At time step i, designate the input and reservoir states as xi=[x1;x2;⋯;xl]∈ℝl and ιi=[ι1;ι2;⋯;ιr]∈ℝr, respectively. The reservoir is updated through

ιi=f(Winxi+Wresι(i-1)),    (7)

and the output of the network is

yi=g(Woutιi),    (8)

with yi=[y1;y2;⋯;yo]∈ℝo. Furthermore, to work out the output weights, the reservoir states and outputs are recorded during training in the matrices Λ=[ι1,ι2,⋯,ιĩ]∈ℝr×ĩ and Y=[y1,y2,⋯,yĩ]∈ℝo×ĩ, respectively, where ĩ denotes the number of training samples. Thereafter, by solving

minWout: ||Y−WoutΛ||22,    (9)

the output weights are obtained as

Wout=YΛT(ΛΛT)−1,    (10)

where the superscripts T and −1 represent transpose and inversion operations of a matrix, respectively (Su et al., 2023b; Liao et al., 2024a).
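The training procedure of Equations (7)–(10) can be sketched as follows; the sine-prediction task, the reservoir size, and the small ridge term added to ΛΛT for numerical stability are illustrative assumptions, not part of the original formulation:

```python
import numpy as np

rng = np.random.default_rng(0)
l, r_dim, o, n_train = 1, 50, 1, 300

# Random input and reservoir weights; rescale W_res so its spectral
# radius is below 1 (echo state property).
W_in = rng.uniform(-0.5, 0.5, (r_dim, l))
W_res = rng.uniform(-0.5, 0.5, (r_dim, r_dim))
W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))

# Toy task: one-step-ahead prediction of a sine wave.
u = np.sin(0.2 * np.arange(n_train + 1))
iota = np.zeros(r_dim)
states = []
for i in range(n_train):
    iota = np.tanh(W_in @ u[i:i+1] + W_res @ iota)  # reservoir update, Equation (7)
    states.append(iota.copy())
Lam = np.array(states).T              # state matrix, r x n_train
Y = u[1:n_train+1].reshape(1, -1)     # one-step-ahead targets

# Output weights via least squares, Equation (10); a tiny ridge term
# keeps the Gram matrix invertible.
W_out = Y @ Lam.T @ np.linalg.inv(Lam @ Lam.T + 1e-8 * np.eye(r_dim))
```

Only `W_out` is trained; `W_in` and `W_res` stay fixed after their random initialization, which is the defining trait of the ESN.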

3 Online learning fuzzy echo state network

Motivated by the commonalities between the ESN and FIS, the OLFESN is proposed in this section. Then, an OLFESN-based control scheme for redundant manipulators is devised.

3.1 OLFESN

Considering Equation (4), the firing strength (IF part) in the TSK fuzzy system involves multiple fuzzy conjunction operations, providing sufficient computing power for thoroughly exploring and utilizing the input information. Furthermore, each rule is normalized to ensure that different rules make comparable contributions to the system. Similarly, in the ESN, it is the reservoir that implements the analogous function, by which the low-dimensional input is mapped to a high-dimensional dynamic space. In addition, the outputs of different reservoirs are adjusted to the same extent with the aid of the activation function f(•), which plays the same role as Equation (5). Therefore, the reservoir is adopted to play the part of the normalized firing strength in the proposed OLFESN. Specifically, the OLFESN with k~ reservoirs is established as follows:

Given the training samples ℶ={(xi, yi)}i=1ĩ, the state of the k-th reservoir is updated via

ιki=fk(Winxi+Wresιk(i−1)), i=1,2,⋯,i˜,    (11)

where fk(•) denotes the activation function of the k-th reservoir, and ĩ is the number of training samples. Collect all states of the k-th reservoir in Ξk = [ιk1, ιk2, ⋯, ιkĩ]; then, integrating all k~ reservoirs yields

Λ=[Ξ1; Ξ2; ⋯; Ξk~].    (12)

Thus, the output of the fuzzy ESN (FESN) can be formulated as

Y=WoutΛ,    (13)

with Y = [y1, y2, ⋯, yĩ]. Similarly to Equation (10), the output weights are obtained via

Wout=YΛT(ΛΛT)-1.    (14)
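A minimal sketch of the batch FESN (Equations 11–14) follows; giving each of the k~ reservoirs its own activation function, as well as all names and the small ridge term added for numerical stability, are assumptions of this illustration:

```python
import numpy as np

def fesn_fit(X, Y, W_in, W_res, activations):
    """Batch FESN training. X: l x n inputs; Y: o x n targets.
    Each activation function drives one reservoir, i.e. one fuzzy 'rule'."""
    r_dim, n = W_res.shape[0], X.shape[1]
    blocks = []
    for f in activations:                  # k~ reservoirs, Equation (11)
        iota = np.zeros(r_dim)
        S = np.zeros((r_dim, n))
        for i in range(n):
            iota = f(W_in @ X[:, i] + W_res @ iota)
            S[:, i] = iota
        blocks.append(S)
    Lam = np.vstack(blocks)                # stacked state matrix, Equation (12)
    G = Lam @ Lam.T + 1e-8 * np.eye(Lam.shape[0])  # ridge for invertibility
    W_out = Y @ Lam.T @ np.linalg.inv(G)   # output weights, Equation (14)
    return W_out, Lam
```

Stacking the k~ reservoir state blocks makes Λ a (k~·r) × ĩ matrix, so the single least-squares solve of Equation (14) fits all rules at once.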

At this point, the derivation of the FESN is complete. Subsequently, taking into account the need for online learning, the OLFESN is proposed, which combines the FESN with an online learning strategy for the ESN. To be more specific, when data arrive continually, the OLFESN proceeds as follows:

3.2 Initialization phase

a. Given the initial training samples ℶ0={(xi, yi)}i=1ĩ0, update and record the states of all k~ reservoirs using Equation 11.

b. Taking advantage of Equation 12, figure out the initial state matrix Λ0 for FESN.

c. Compute the initial output weights Wout0=Y0Λ0TT0 with T0=(Λ0Λ0T)−1 and Y0 = [y1, y2, ⋯, yĩ0], in accordance with Equation 14.

d. Let p = 0.

3.3 Sequential learning phase

a. With the new sample set

ℶp+1={(xi, yi)}, i=(∑j=0pĩj)+1, ⋯, ∑j=0p+1ĩj,

solve the problem

minWoutp+1: ||Woutp+1[Λp, Λp+1]−[Yp, Yp+1]||22,    (15)

where ĩp+1 signifies the number of samples in the (p+1)-th set; Λp+1 is the corresponding reservoir state matrix, obtained via Equations 11 and 12; Yp+1 = [y(∑j=0pĩj)+1, ⋯, y∑j=0p+1ĩj].

b. Let Ψp+1=Hp+1−1 with Hp+1=[Λp, Λp+1][Λp, Λp+1]T.

c. Update output weights

Ψp+1=Ψp−ΨpΛp+1(I+Λp+1TΨpΛp+1)−1Λp+1TΨp,
Woutp+1=Woutp+(Yp+1−WoutpΛp+1)Λp+1TΨp+1.    (16)

d. Let p = p + 1 and return to step a.
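Taken together, the initialization and sequential phases amount to recursive least squares on the output weights. A minimal sketch, with the reservoir states assumed to be precomputed and all names hypothetical:

```python
import numpy as np

def olfesn_init(Lam0, Y0):
    """Initialization phase: Psi_0 = (Lam0 Lam0^T)^-1 and the
    initial output weights from Equation (14)."""
    Psi = np.linalg.inv(Lam0 @ Lam0.T)
    W_out = Y0 @ Lam0.T @ Psi
    return W_out, Psi

def olfesn_update(W_out, Psi, Lam_new, Y_new):
    """Sequential phase, Equation (16): fold in a new batch of reservoir
    states without re-inverting the full Gram matrix."""
    k = Lam_new.shape[1]
    M = np.linalg.inv(np.eye(k) + Lam_new.T @ Psi @ Lam_new)
    Psi = Psi - Psi @ Lam_new @ M @ Lam_new.T @ Psi
    W_out = W_out + (Y_new - W_out @ Lam_new) @ Lam_new.T @ Psi
    return W_out, Psi
```

One update with the new state block reproduces the batch solution of Equation (14) exactly, and when Λp+1 is a single column the update reduces to the rank-one form given below in Remark 1.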

Remark 1: For the case in which new samples arrive one by one, with the aid of the Sherman–Morrison formula (Chen et al., 2024b), Equation 16 is further simplified as

Ψp+1=Ψp−Ψpιp+1ιp+1TΨp/(1+ιp+1TΨpιp+1),
Woutp+1=Woutp+(yp+1−Woutpιp+1)ιp+1TΨp+1.    (17)

3.4 OLFESN-based control scheme

In this section, an OLFESN-based control scheme for redundant manipulators is developed for performing the given missions. At moment t, define θa(t) and Δθa(t) as the actual joint angle and the actual joint angle increment, respectively. Meanwhile, the actual and desired positions of the end-effector are denoted by ζa(t) and ζd(t), respectively. Correspondingly, the desired position increment for the end-effector at moment t+1 is expressed as Δζ(t+1) = ζd(t)−ζa(t). Concatenating θa(t), Δθa(t), and Δζ(t+1) forms the input of the OLFESN, denoted by x(t) for the convenience of subsequent expressions. Then, applying Equations 11–13, we obtain the joint angle increment Δθa(t+1) for the next moment, i.e., the output of the OLFESN. Hence, the control signal for the next moment is acquired as θa(t+1) = θa(t) + Δθa(t+1).
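One cycle of the scheme above can be sketched as follows; the function names are hypothetical, and the OLFESN forward pass is abstracted as a callable `olfesn_predict`:

```python
import numpy as np

def control_step(theta_a, dtheta_a, zeta_a, zeta_d, olfesn_predict):
    """One cycle of the OLFESN-based scheme: build the network input
    x(t) from the joint state and the end-effector position error,
    query the OLFESN for the joint increment, and integrate the command."""
    dzeta = zeta_d - zeta_a                           # desired increment, delta-zeta(t+1)
    x = np.concatenate([theta_a, dtheta_a, dzeta])    # network input x(t)
    dtheta_next = olfesn_predict(x)                   # OLFESN output: joint increment
    theta_next = theta_a + dtheta_next                # control signal theta_a(t+1)
    return theta_next, dtheta_next
```

In closed loop, `theta_next` is sent to the joints, the resulting end-effector position becomes the next ζa(t), and the cycle repeats.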

Note that, in the OLFESN,
