Brain-inspired biomimetic robot control: a review

1 Introduction

The field of robotics is advancing toward increasingly sophisticated robots that are able to perform tasks previously reserved for humans due to their high complexity, especially when performed in unstructured environments. Such tasks range from human-robot interaction with compliance and motion constraints, to handling and manipulating objects of arbitrary shapes and materials in a dexterous manner, to legged-robot navigation in challenging environments. In many of these tasks, the robots being developed feature new structural materials and ways of actuation, and often present a high number of degrees of freedom. These include anthropomorphic musculoskeletal robotic systems (Diamond et al., 2012; Asano et al., 2017), soft-robotic arms and grippers (Cianchetti et al., 2018; Walker et al., 2020) and other robots such as walking robots (Coelho et al., 2021; Lyashenko et al., 2021). Novel ways of actuation include artificial muscles (Carpi, 2016; Mirvakili and Hunter, 2018). The high dimensionality and non-linearities present in these systems, as well as the increasing complexity of the tasks the robots need to perform, pose a challenging control problem.

Conventional model-based control approaches guarantee strong stability properties of the controlled system and prescribed accuracy, even in the presence of structured and unstructured uncertainties. However, their design complexity scales very poorly with dimensionality, and they are therefore difficult to generalize, maintain and tune for complex robot tasks. On the other hand, model-free or learning-based solutions, such as machine learning and statistical modeling methods, can efficiently manage high-dimensional systems. Yet they come with heavy computational demands, struggle to adapt to different scenarios and lack assurances of stability and robustness.

Taking inspiration from biology, where humans and animals are able to gracefully and efficiently perform complex motion tasks, the scientific community of robotics has started pursuing research on new control strategies that are based on biological learning principles and architectures. More specifically, control schemes that reflect brain structures relevant to motor control have recently become central in pursuing efficient adaptive control of complex motion systems. These schemes are often referred to as Brain-Inspired Control. Brain-inspired control is included in the broader category of bio-inspired control, which accounts for any control method drawing inspiration from biology. Often bio-inspired control may only approximate biology vaguely and on a high abstraction level. The methods that more closely model the working principles of biological systems, in this case the brain, are referred to as biomimetic control (BC) methods. Figure 1 shows a diagram with this classification.

www.frontiersin.org

Figure 1. Categorization of model-free and data-driven controllers. Model-free controllers may work without a model (Fliess and Join, 2013) or rely on data to obtain it. Bio-inspired control draws inspiration from biology to different degrees to design the controller, and brain-inspired control focuses specifically on methods inspired by the brain. When these methods closely mimic the processes and structures in the brain, they are referred to as biomimetic controllers. These can model either the overall behavior, using various function-approximation methods (functional approach), or the neuronal circuits and cells that give rise to such behavior (cellular approach). This review focuses on biomimetic control (BC).

This study presents a review of BC methods applied to robotics. Specifically, it focuses on those methods that replicate function, structure or cellular-level processes from certain brain areas involved in motor control, and some of their interconnections. The remainder of the paper is structured as follows: Section 2 provides an overview of conventional control strategies for robotics, classifying them according to their use of analytical vs. data-driven modeling approaches; Section 3 delves deeper into the core concepts of brain-inspired controllers; Section 4 presents the most relevant works on BC in recent decades and classifies them by the brain areas they model and the robotics tasks they address; and finally Section 5 closes with some remarks and possible future trends.

2 Overview of robot control methods

In general, the standard procedure for control systems design can be divided into two steps: model identification and controller design. For complex systems, the model identification phase is usually the most laborious one since it requires extensive knowledge of the equations describing the system. The controller design phase can vary in difficulty depending on the requirements of the control task and the level of non-linearity of the controlled system, which leads to the choice of simpler or more advanced controllers.

Depending on how much they rely on analytical modeling vs. empirical data measurements to fit a model, controllers can be classified into several categories. We distinguish three main ones: model-based controllers, model-free or data-driven controllers, and hybrid controllers. Model-based controllers follow the standard two-step approach and fully rely on analytical knowledge of the system; model-free or data-driven controllers may not use a model at all or rely on data to obtain it, and may also include the controller design phase in the data-driven modeling procedure; hybrid controllers use empirical data but impose some constraints on the model, which are usually informed by physics or related to the control task. Bio-inspired controllers may be included in the data-driven category and, in some cases, in the hybrid category.

Next, a brief overview of different existing methods for the presented controller categories is introduced, showing their strengths and weaknesses and motivating the research interest in bio-inspired controllers.

2.1 Model-based control

Conventional model-based controllers face several challenges when applied to novel robotic systems (analytical models are difficult to obtain, and the controllers are hard to tune and scale poorly). Nevertheless, some solutions have been proposed, since these controllers can be useful under certain simplifying assumptions and for certain tasks and scenarios. For humanoid, musculoskeletal, and walking robots, part of the challenge lies in scaling to the many degrees of freedom, which implies that conventional analytical inverse kinematics and dynamics modeling can be used given enough computing power. Qiao et al. (2023) present an overview of several control strategies, including model-based ones, for controlling musculoskeletal robots. For hexapod walking robots, Coelho et al. (2021) present several kinematics- and dynamics-based control methods. For conventional control methods for quadruped and bipedal robots, refer to De Santos et al. (2006) and Westervelt et al. (2018). In the case of soft robotics, the physical equations of motion are not trivial to model, since the robots deform continuously along their length and use novel actuation methods with significant nonlinearities. Different approximations exist, modeling the dynamics and kinematics to different degrees and trading accuracy against efficiency. Santina et al. (2023) review the state of the art of model-based control for soft-robotic systems.

2.2 Model-free and data-driven control

Model-free controllers have been gaining traction in the field of robotics in recent years because they remove the need for a tedious analytical modeling phase in complex systems. Some literature (Fliess and Join, 2013) refers to model-free control as the set of methods that do not use any model to perform control, although they may have adaptive gains. The methods that use a model obtained through data are more correctly referred to as data-driven controllers. These data-derived models may encompass first principles (Fasel et al., 2021) or can be black-box input/output representations. In this paper we do not make a hard distinction between these variations, and we use the term model-free to refer to all of them.

The most relevant data-driven methods used for control can be classified into two main families of algorithms: machine learning (ML) and statistical modeling. Statistical modeling introduces fundamental assumptions about the data distribution, enhancing interpretability but constraining the level of complexity that the obtained models can represent. Statistical modeling techniques have been used in robotics (Vijayakumar et al., 2005; Nguyen-Tuong et al., 2009), with applications in novel robotic systems such as soft robots (Tang et al., 2022).

Machine learning algorithms include, among other methods, supervised learning and reinforcement learning (RL), which are the two most commonly used approaches for learning in robotics. The work in Singh et al. (2022) presents a survey of RL methods for general robotics systems, while Wang et al. (2021) presents machine learning-based control methods for soft robotics. Supervised learning can be used to learn a dynamics model of the robot and environment which is then used by an adaptive control policy in a model-based RL setting (Polydoros and Nalpantidis, 2017; Zhang and Mo, 2021).

In recent years, research in robotics has predominantly focused on ML algorithms, particularly on artificial neural networks (ANNs). ANNs excel in handling complex data but demand large volumes of data, which are costly to obtain. Moreover, they struggle with quick online adaptation, suffering from issues like catastrophic forgetting and slow adjustment to changes in the robot or environment (Kirkpatrick et al., 2017). Some ML (Hoi et al., 2021) and statistical modeling techniques (Vijayakumar et al., 2005; Nguyen-Tuong et al., 2009) offer online learning capabilities; however, they are intricate to fine-tune and less effective as task complexity increases. Overall, these data-driven methods lack formal guarantees of robustness and stability due to their black-box nature, which impedes mathematical analysis and limits extrapolation beyond the training data.

2.3 Hybrid control

Often data-driven methods can benefit from partial knowledge of the system which can be used to assist or reduce the extent of the learning task. Alternatively, some model-based control architectures can use a model that is obtained or tuned through data-driven methods. This gives rise to hybrid control methods, which combine model-based and model-free techniques to complement each other's shortcomings.

A common modeling choice for systems with complex nonlinear dynamics is to approximate their behavior with a set of physics-informed nonlinear dynamics equations whose coefficients are obtained in a data-driven way. Afterwards, a nonlinear control system can be built around this model, taking advantage of the equations obtained. A popular modeling technique in this category is the Sparse Identification of Nonlinear Dynamics (SINDy) (Brunton et al., 2016). It uses sparse regression to identify the most relevant terms in a library of candidate nonlinear functions, resulting in a concise model that captures the essential dynamics of the system, making it popular for its interpretability and computational efficiency. Some examples of robotics applications of SINDy are found in Chen et al. (2021) for trajectory tracking and in Bhattacharya et al. (2020) for soft-robot modeling.
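The sparse-regression step at the core of SINDy can be illustrated with a minimal sketch. The code below is not the implementation of Brunton et al. (2016) or of any SINDy library: it is a plain NumPy version of sequentially thresholded least squares, applied to a hypothetical noise-free one-dimensional system dx/dt = -2x - 0.5x^3. The candidate library [1, x, x^2, x^3], the threshold value and the function name `stlsq` are all assumptions chosen for illustration.

```python
import numpy as np

def stlsq(theta, dxdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares (illustrative sketch).

    theta: (n_samples, n_terms) library of candidate functions.
    dxdt:  (n_samples,) measured or estimated time derivatives.
    """
    # Start from the ordinary least-squares fit over the full library.
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0              # prune terms with negligible coefficients
        big = ~small
        if not big.any():
            break
        # Refit using only the terms that survived the pruning.
        xi[big] = np.linalg.lstsq(theta[:, big], dxdt, rcond=None)[0]
    return xi

# Hypothetical data: states on a grid and exact derivatives of the toy system.
x = np.linspace(-2.0, 2.0, 200)
dxdt = -2.0 * x - 0.5 * x**3

# Assumed candidate library: [1, x, x^2, x^3].
theta = np.column_stack([np.ones_like(x), x, x**2, x**3])
xi = stlsq(theta, dxdt)
print(xi)
```

In this toy case the coefficients of the constant and quadratic terms are pruned to zero, leaving a two-term model. With real measurements, the derivatives would have to be estimated from noisy data and the threshold tuned accordingly.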

Other hybrid controller methods include data-driven model predictive control (Berberich et al., 2021), which has a broad application in robotic systems, and the work in Reinhart et al. (2017), which was used to control a soft-robot arm.

Overall, the area of hybrid controllers is a relatively unexplored line of research that holds promise for enhancing the interpretability of data-driven methods while retaining the good stability and robustness properties of model-based approaches.

3 Brain-inspired control paradigms

In the domain of model-free controllers, a different approach emerges by delving deeper into biological paradigms. Animals, and particularly humans, are capable of performing advanced motion tasks dexterously, learning new behaviors efficiently, and adapting to new physical situations and changes in the environment. This is realized by highly evolved brains, and specifically by their sensorimotor brain areas. These areas address different functions in motion control and complement each other, presenting a connective structure and activity that can be studied and modeled. While neuroscience still holds many mysteries, our current understanding of certain brain regions provides ample inspiration for developing novel computational methods capable of emulating their functionalities. By closely mimicking their operational principles, it becomes possible to attain their desirable attributes, including sparse and efficient computations, lifelong learning and online adaptation (DeWolf, 2021). This approach holds the promise of resolving many challenges inherent in prevalent data-driven algorithms such as ANNs.

Brain-inspired control algorithms vary in their approximation of biology, spanning a spectrum from replicating solely high-level processes or functions to simulating the intricacy of neurons and neural circuits found within motor areas. On the highest abstraction level, some methods that are based on different learning approaches can be included, such as iterative learning (Wang et al., 2009) and active inference (Pezzato et al., 2020). ANNs for control and RL methods are also brain-inspired; however, they represent only a very coarse approximation of the brain structure (ANNs) or a high-level behavioral process (RL). Control approaches using these learning methods are excluded from this review, with the exception of some RL cases which are framed in the context of a specific motor brain area (e.g., the basal ganglia, BG).

This review focuses on the works that attempt to closely replicate the processes and structures of the motor brain areas. The controllers based on these methods are also referred to as BCs. There are two main approaches to mimicking the brain motor areas: the functional and the cellular ones. This was a distinction introduced for cerebellar models (Luque et al., 2014), and we extend it to other brain areas.

The functional point of view identifies the substructures of each motor area and aims to replicate their behavior with generic conventional learning or function approximation methods. Then it establishes the relationships between these substructures with the proper connections to mimic the transfer of information taking place in the biological counterpart. This approach does not harness the full potential of biological networks since the function approximation methods present their own limitations. Generally, this approach can be used to validate neuroscientific hypotheses concerning high level processes, e.g., evaluating the role of specific motor areas as a whole or their relationships with other areas.

The cellular point of view, also referred to as bio-plausible in this review, seeks to replicate the lower-level mechanisms of the brain motor areas by modeling these down to individual cells and microstructures. Models of neurons create the basis for building the emulated brain circuits, which are implemented by establishing the required synaptic connections. The implementation of cellular-level models still represents an approximation of the whole biochemistry involved in the neuronal processes, but gives rise to a desired behavior which is useful for robotics control. Additionally, this approach allows for testing neuroscientific hypotheses with finer detail than the functional modeling approach (Tolu et al., 2023), addressing not only the role of the emulated areas but also of more specific low-scale groups of neurons and their connections.

To learn and represent functions and behaviors, the bio-plausible neuron models are connected in a similar way to traditional artificial neural networks, but with inhibitory and excitatory connections, and they communicate via spike signals. This distinctive feature gives rise to what are known as spiking neural networks (SNNs) (Maass, 1997; Ghosh-Dastidar and Adeli, 2009). Due to the spiking nature of the neurons, these networks present several desirable characteristics that are also found in the biological networks that make up animal brains:

• The spike-based communication between neurons is a highly efficient way to convey information through the network, and this becomes evident when implemented in energy-efficient neuromorphic hardware.

• The network is sparsely activated, which means that only a small subset of the neurons are active at a given time, thus reducing the energy consumption further.

• The dynamic behavior of the neurons makes SNNs good candidates for representing time-dependent data, such as dynamic models in robotics.
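The spiking, sparsely active behavior listed above can be illustrated with one of the simplest neuron models, the leaky integrate-and-fire (LIF) neuron. The sketch below is only an illustration under stated assumptions: the neuron model, the normalized units (threshold of 1), the membrane time constant and the function name `simulate_lif` are chosen here for clarity and are not taken from any of the reviewed works.

```python
import numpy as np

def simulate_lif(input_current, dt=1e-3, tau_m=20e-3,
                 v_rest=0.0, v_thresh=1.0, v_reset=0.0):
    """Forward-Euler simulation of a leaky integrate-and-fire neuron.

    input_current: 1-D array of (normalized) input values, one per time step.
    Returns a boolean array marking the time steps at which a spike occurs.
    """
    v = v_rest
    spikes = np.zeros(len(input_current), dtype=bool)
    for k, i_k in enumerate(input_current):
        # The membrane potential leaks toward rest and integrates the input.
        v += dt / tau_m * (-(v - v_rest) + i_k)
        if v >= v_thresh:
            spikes[k] = True   # emit a spike ...
            v = v_reset        # ... and reset the membrane potential
    return spikes

steps = 1000  # 1 s of simulated time with dt = 1 ms
weak = simulate_lif(np.full(steps, 0.5))      # sub-threshold input: no spikes
strong = simulate_lif(np.full(steps, 1.5))    # supra-threshold input
stronger = simulate_lif(np.full(steps, 3.0))  # larger input, higher firing rate
print(weak.sum(), strong.sum(), stronger.sum())
```

The neuron stays silent for inputs that cannot drive the membrane potential to threshold, and its firing rate grows with the input otherwise, so information is carried by a small number of discrete events rather than by continuous activations.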

Several studies have explored biologically plausible learning methods for spiking neural networks (SNNs) using spike-timing-dependent plasticity (STDP) (Feldman, 2012). These include unsupervised STDP-based models with adaptive mechanisms (Dong et al., 2023), supervised learning methods combining STDP with synaptic scaling and intrinsic plasticity (Hao et al., 2020), and online-learning models for hardware implementation (Qiao et al., 2019). These works aim to bridge the gap between biologically plausible approaches and backpropagation-based methods while providing insights into how learning occurs in biological systems. Taherkhani et al. (2020) show an overview of biologically plausible learning methods for SNNs.
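As a concrete illustration of STDP, the following minimal sketch implements the commonly used pair-based form of the rule, in which the weight change depends exponentially on the time difference between a presynaptic and a postsynaptic spike. The amplitudes, time constants, weight bounds and function names are hypothetical values chosen for illustration; the works cited above use their own, more elaborate variants.

```python
import math

def stdp_delta_w(dt_spike, a_plus=0.01, a_minus=0.012,
                 tau_plus=20e-3, tau_minus=20e-3):
    """Pair-based STDP weight change for dt_spike = t_post - t_pre (seconds)."""
    if dt_spike > 0:
        # Pre fires before post: potentiation, decaying with the delay.
        return a_plus * math.exp(-dt_spike / tau_plus)
    if dt_spike < 0:
        # Post fires before pre: depression, decaying with the delay.
        return -a_minus * math.exp(dt_spike / tau_minus)
    return 0.0

def apply_stdp(weight, t_pre, t_post, w_min=0.0, w_max=1.0):
    """Update a synaptic weight for one spike pair, clipped to [w_min, w_max]."""
    new_w = weight + stdp_delta_w(t_post - t_pre)
    return min(max(new_w, w_min), w_max)

w = 0.5
w_pot = apply_stdp(w, t_pre=0.010, t_post=0.015)  # pre leads post by 5 ms
w_dep = apply_stdp(w, t_pre=0.015, t_post=0.010)  # post leads pre by 5 ms
print(w_pot, w_dep)
```

Because the update uses only the spike times of the two neurons that a synapse connects, the rule is local, which is one reason it is attractive for online learning and for hardware implementations.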

Some other studies have explored alternative approaches to make SNN training and deployment more efficient and robust, albeit not retaining complete biological plausibility. Tavanaei et al. (2019) provide an overview of methods for training deep SNNs, while Kim et al. (2023) analyze temporal information dynamics during training. Yao et al. (2023) introduce an attention module to improve performance and energy efficiency, and Yang et al. (2023a) propose a multi-scale learning rule with dendritic predictive characteristics. Yang and Chen (2023a,b), and Yang et al. (2023b) present information-theoretic learning approaches using nonlinear information bottleneck principles and explore the design space of the information bottleneck framework to improve robustness, accuracy, and power efficiency in SNNs.

For an overview of the uses and properties of SNNs as well as their training methods, refer to Yamazaki et al. (2022) and Pietrzak et al. (2023) respectively.

The cellular-level approach can become computationally expensive depending on the level of biological fidelity and the number of simulated neurons. This is problematic for current common computing hardware such as GPUs and CPUs, but neuromorphic hardware (Young et al., 2019; Rathi et al., 2023) addresses this issue by implementing the behavior of neurons at the physical level, i.e., in silico, which substantially improves computational power and efficiency. Some neuromorphic platforms that have been developed by semiconductor companies include Intel®Loihi (Davies et al., 2018), with programmable spiking neural network features, and IBM®TrueNorth (Akopyan et al., 2015), with 1 million neurons and 256 million synapses. Others have been used for large-scale research projects such as the Human Brain Project (Amunts et al., 2016), including SpiNNaker (Furber et al., 2014), designed for large-scale spiking neural network modeling, and BrainScaleS (Pehle et al., 2022), which combines analog spiking neural network emulation with digital components. Several works on robotics have used some of these platforms to different degrees, which will be presented later.

Other research groups have developed alternative neuromorphic platforms that leverage custom mixed-signal circuits and specialized digital architectures to emulate the behavior of biological neurons and synapses with varying levels of abstraction and realism. For example, Yang et al. (2022b) propose a hybrid neuromorphic platform integrating multiple granules of SNNs, demonstrating the replication of cognitive activities such as motor learning and action selection. Yang et al. (2021) focus on a large-scale cerebellar network model and architecture for supervised motor learning, with over 3.5 million neurons, better mimicking the structure of the biological cerebellum. Additionally, Yang et al. (2024a) present a neuromorphic architecture with dendritic on-line learning (NADOL) for brain-inspired intelligence on embedded hardware, exhibiting superior learning capabilities compared to GPU platforms. With an emphasis on fault tolerance, Yang et al. (2022a, 2024b) present neuromorphic frameworks capable of robust learning and decision-making. While Yang et al. (2022a) focus on context-dependent learning of stimulus-response associations, Yang et al. (2024b) integrate visual perception with decision-making, demonstrating high accuracy and minimal latency.

Neuromorphic computing remains an emerging field, especially concerning large-scale simulations, yet it holds vast potential, and its current limitations are expected to diminish as hardware progresses. Presently, implementations on conventional computing platforms lean toward simplified neuron models, which retain the fundamental characteristics of neurons while boosting performance and enabling greater scalability.

This paper predominantly reviews cellular-level implementations of BCs, as they show considerable promise for replicating brain functions in the future. This potential grows as our structural understanding of the brain evolves and neuromorphic hardware advances. However, the review also covers significant works that employ a functional approach, as these lay the groundwork for future advancements and can be adapted to accommodate SNNs. The main works on BCs found in the literature are presented in the next section.

4 Biomimetic control models

Over the past few decades, the accumulation of evidence concerning how the brain's motor areas function has inspired the creation of computational models. These models aim to mimic their behavior and utilize their characteristics in robotics applications. The upcoming sections detail the latest developments in BCs, striving to merge the exploration and validation of current neuroscientific theories with the practical task of controlling robots, encompassing both simulated environments and real-world scenarios.

Since the works on BCs mostly focus on modeling a single or few brain areas, these sections are organized according to the main areas modeled in the brain motor control hierarchy. Inside these sections, the different robotics tasks addressed are presented. This is depicted in Figure 2, and summarized in Table 1. Nevertheless, some works attempt to model a wider range of brain areas and their interconnections; these are known as systems-level models.

Figure 2. Visual depiction of the brain motor control areas covered in this review and the main robotics control tasks they specialize in.

Table 1. Classification of the different works in literature on brain-inspired motor control, sorted by brain areas modeled, robotics tasks addressed, and the approach taken to achieve bio-mimicry (cellular-level or functional).

The brain motor control hierarchy

The brain's motor system comprises specialized areas dedicated to distinct functions in controlling movement. These regions follow a hierarchical arrangement: higher-level domains oversee broader tasks with considerable abstraction, while lower-level segments focus on individual muscles, delivering precise signals tailored to the task's specifics. Complementing this structure are additional side structures (side loops) responsible for regulating signals within the descending pathways of this hierarchical system. For more details about the brain motor control hierarchy refer to Byrne and Dafny (1997).

• At higher levels, the motor cortex encodes movement force and spatial details, with subdivisions like the premotor and supplementary motor areas handling motion anticipation and kinematic information. Simultaneously, the association cortex aids in environmental representation and action selection based on context.

• At lower levels of the hierarchy, the spinal cord coordinates reflexes, contains Central Pattern Generators (CPGs) for rhythmic movements, controls muscles, and manages vital sensory pathways. Meanwhile, the brainstem acts as a central hub, linking the brain to the body's components, overseeing fundamental functions like balance, breathing, and heart rate.

• Within the side-loop areas, the BG determine suitable motor programs, ensuring the execution of appropriate motor actions. Conversely, the cerebellum contributes to balance, posture, movement coordination, motor learning and the refinement of motor skills.

4.1 High-level BC

In the current robotics paradigm, the functions carried out by the motor cortex and the association cortex can be linked to well-studied specific tasks. For the association cortex, these are trajectory planning and reference generation. The tasks related to the motor cortex, in turn, encompass spatial and coordinate transformations, along with the creation of inverse dynamics models that convert references into forces or motor commands. Currently, most studies on bio-inspired controllers assume that the references are given and thus exclude the association cortex by replacing it with an analytical trajectory generation module. Moreover, the role of the motor cortex in representing an inverse model is frequently overlooked, and the motor cortex is often substituted by an analytical controller. This is supplemented by a model of the cerebellum that offers adjustments based on motor or sensory input. Nevertheless, the following works have modeled the functions of the high-level brain motor areas to certain extents.

DeWolf and Eliasmith (2011) presented a framework for simulating the hierarchical structure of the motor areas, which includes the pre-motor cortex and supplementary motor area to generate high-level reference control signals. However, they did not implement all the areas, and only simulated a model of the motor cortex with SNNs. The system is capable of controlling a simulated 2-DOF arm in a 2-dimensional target-reaching task. Some years later, in DeWolf et al. (2016), the same authors introduced the REACH model, a biologically plausible adaptive hierarchical approach that incorporates the pre-motor cortex. This model generates adaptive dynamical motion primitives to define desired trajectories, controlling a simulated 2-DOF robot arm in tasks such as trajectory tracking and reaching. The primary motor cortex plays a role in learning to model dynamics, supported by the cerebellum. Additionally, it corrects inaccuracies within the robot's Jacobian model. It receives the targets from the pre-motor cortex and the current system state from sensory cortices, and produces low-level signals as motor commands. Iacob et al. (2021) used REACH on a real robot with 3 DOF. They noted that, although the tasks were performed successfully, performance was lower than in the original paper. They also showed that the architecture is capable of disturbance rejection. In DeWolf et al. (2023), the REACH model was employed to control a 7-DOF simulated robot arm. Notably, the approach did not involve a biomimetic method for generating references; instead, it was used solely to compute joint forces from desired workspace forces. Throughout these works, the authors consistently employed the Neural Engineering Framework (NEF) (Eliasmith and Anderson, 2003), a framework designed for simulating assemblies of spiking neurons.
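The mapping from desired workspace forces to joint forces mentioned above is, in its conventional analytical form, given by the transpose of the arm Jacobian: tau = J(q)^T F. The sketch below shows this relation for a hypothetical planar 2-link arm with unit link lengths. It is the textbook analytical mapping, not the spiking implementation used in REACH, and the function names and arm parameters are assumptions made for illustration.

```python
import numpy as np

def planar_2link_jacobian(q, l1=1.0, l2=1.0):
    """Jacobian of the end-effector position of a planar 2-link arm."""
    q1, q2 = q
    s1, c1 = np.sin(q1), np.cos(q1)
    s12, c12 = np.sin(q1 + q2), np.cos(q1 + q2)
    return np.array([[-l1 * s1 - l2 * s12, -l2 * s12],
                     [ l1 * c1 + l2 * c12,  l2 * c12]])

def joint_torques(q, f_task, l1=1.0, l2=1.0):
    """Map a desired end-effector force to joint torques: tau = J(q)^T F."""
    return planar_2link_jacobian(q, l1, l2).T @ f_task

# Arm with the first link along +x and the second link along +y,
# so the end effector sits at (1, 1).
q = np.array([0.0, np.pi / 2])
f_task = np.array([0.0, 1.0])   # desired unit force along +y
tau = joint_torques(q, f_task)
print(tau)
```

In this configuration the force produces a torque about the first joint only, since its line of action passes through the second joint. An adaptive controller would typically add learned corrections on top of such a nominal mapping when the Jacobian or dynamics model is inaccurate.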

A functional model of the prefrontal regions was used in Gentili et al. (2012, 2016) to send the desired reaching position. This model learns the relationship between joint and spatial coordinates. In Zahra et al. (2021b), the authors introduced a differential map designed to convert desired velocities within task space into corresponding motor commands within joint space. This mapping is executed through SNNs and is trained offline before task execution. Notably, the same approach is employed by these authors in subsequent works (Zahra et al., 2021a, 2022a,b), where they successfully implemented it to control a real robotic arm. Baladron et al. (2023) used a pre-motor cortex model to generate goal positions, and a motor cortex-basal ganglia loop to select a concrete action. They implemented the algorithm with SNNs to control a simulated arm with 4 DOF. The work discussed in Corchado et al. (2019) implemented an iterative learning controller, specifically a Learning Feedback Controller, designed to simulate the function of the motor cortex. This controller generates control actions by considering the tracking error as a basis for its decision-making process. In Zhang et al. (2023), a recurrent neural network is used as the primary motor cortex that sends motor commands to a musculoskeletal robot, based on several targets to be reached.

Some works use other non-biologically inspired methods but still link them with the cortical areas. For example, in Garrido et al. (2013) and Abadia et al. (2021a), the authors use a conventional trajectory planner with known inverse kinematics and refer to it as the association cortex, and in Garrido et al. (2013), the motor cortex is represented by a recursive Newton-Euler algorithm that provides approximate motor commands given the available inverse dynamic model.

4.2 Low-level BC

The most relevant role for robotics carried out by these areas is rhythm generation. This takes place in the spinal cord by groups of neurons within CPGs. The rhythms generated by CPGs can be modulated through sensory feedback to adapt to different scenarios that demand alteration of gait speed or pattern. CPGs have been mostly applied to legged robots, especially hexapod robots.

Several neuromorphic hardware implementations of CPGs have been proposed. In Cuevas-Arteaga et al. (2017), the authors deployed a spiking CPG on SpiNNaker to control a hexapod robot with different gaits that are chosen based on the visual information obtained from an event camera. Gutierrez-Galan et al. (2020) implemented on SpiNNaker a CPG that can switch online between three different gaits to control a hexapod. Polykretis et al. (2020) presented a spiking CPG network on Intel®Loihi to control a hexapod, showing robustness to noise and different speeds.

Regarding works that take a cellular-level approach in modeling CPGs, Spaeth et al. (2020) presented a minimal network of simulated spiking neurons modulated by sensory feedback that achieves nontrivial behaviors in a flexible walking robot. Strohmer et al. (2020) developed a spiking CPG model capable of continuously changing amplitude, frequency, and phase online, which enables adaptation through feedback.

Massi et al. (2019) modeled several brain motor areas through functional approximations, using non-linear oscillators to model a CPG (functional model). They proposed the use of a learning controller during the optimization process of the locomotion parameters to obtain a final controller configuration with better performance on the walking task of a quadruped robot. In Pitchai et al. (2019), the authors combine a functional CPG with a radial basis function network (RBFN) for locomotion learning of a complex beetle-like robot through reinforcement learning. They focus on the role of the RBFN, which determines the shape of the motor patterns, and show that the robot travels faster and is more energy-efficient than when using only a CPG. In Shao et al. (2022), the authors control the gait of a gecko-inspired robot by using functional CPGs and an RBFN, combined with exteroceptive sensory feedback to evaluate the terrain. This allows the robot to climb tracks with various slopes and bumps/obstacles, establishing a foundation for climbing robots that can adapt to rough terrain. Jeppesen et al. (2020) control the oscillations of a soft robot through a functional CPG with an adaptation mechanism that modulates the amplitude of the signals upon external perturbations. In Schmidt et al. (2021), the authors compared reflexes, functional CPGs, and a combined approach for controlling a biomimetic robot leg. They found that pure reflexes outperformed continuously feedback-adapted CPGs in motion stability and energy efficiency, though pure CPGs allow easier signal modulation. Their results indicate that the combination of reflexes and CPGs could be improved by modulating the shape of the control signal.
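Functional CPGs such as those mentioned above are often built from coupled oscillators whose phase relations define the gait. The sketch below is a generic example under stated assumptions, not the model of any cited work: two phase oscillators with sinusoidal coupling and a desired phase lag of pi (alternating, antiphase motion). The coupling gain, frequency, output amplitude and the function name `simulate_cpg` are hypothetical.

```python
import numpy as np

def simulate_cpg(phase_bias, theta0, freq_hz=1.0, coupling=5.0,
                 dt=1e-3, steps=5000):
    """Forward-Euler simulation of coupled phase oscillators.

    phase_bias[i, j] is the desired phase difference theta_j - theta_i.
    Returns an array of shape (steps, n) with the oscillator phases.
    """
    theta = np.array(theta0, dtype=float)
    history = np.empty((steps, theta.size))
    for k in range(steps):
        # diff[i, j] = theta_j - theta_i - phase_bias[i, j]
        diff = theta[None, :] - theta[:, None] - phase_bias
        # Intrinsic frequency plus coupling that pulls toward the desired lags.
        dtheta = 2.0 * np.pi * freq_hz + coupling * np.sin(diff).sum(axis=1)
        theta = theta + dt * dtheta
        history[k] = theta
    return history

# Two oscillators that should settle in antiphase (e.g., two alternating legs).
phase_bias = np.array([[0.0, np.pi],
                       [-np.pi, 0.0]])
phases = simulate_cpg(phase_bias, theta0=[0.0, 0.3])
joint_setpoints = 0.5 * np.cos(phases)          # rhythmic joint references
final_lag = phases[-1, 1] - phases[-1, 0]
print(np.cos(final_lag))
```

Starting from nearly synchronized phases, the two oscillators converge to the antiphase pattern. In this kind of model, sensory feedback or a higher-level controller can modulate the gait by changing the frequency, the output amplitude or the phase biases online.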

Another application of low-level BC and CPGs in robotics can be found in the control of whisker-like structures, mimicking those found in rodents, to expand the sensory capabilities of mobile robots. In this line of research, Pearson et al. (2007) proposed a multidisciplinary project to reproduce the rodent whisker sensor system in a robotic implementation (Whiskerbot). This project replicated the morphology and mechanics of large whiskers, the whisker movement via a spiking whisker pattern generator (WPG) based on a CPG, and a biologically plausible model of a central nervous system area (specifically the superior colliculus) for sensing and controlling the robot behavior, with action selection through a basal ganglia model. It effectively demonstrated the adaptation of the whisking pattern after contact, which is also displayed by rats. The development of this project continued with SCRATCHbot (Pearson et al., 2010), where the authors increased the number of whiskers and degrees of freedom to test more complex WPGs and improve the whisker-environment interaction. In a succeeding work, the same authors developed a new whisking robot, Shrewbot (Pearson et al., 2011), with improved snout morphology, which made it possible to discern between different surface textures by using a statistical classifier on the whisker sensor data (Sullivan et al., 2012). In simulation, Antonietti et al. (2022) developed an SNN model of the mouse sensorimotor peripheral whisker system, including a CPG, and integrated it into a virtual mouse robot within the Neurorobotics Platform. Together with a cerebellum-inspired controller, they could reproduce active whisking with learning capabilities, matching neural correlates observed in mice.

The research conducted on CPGs for legged robots, soft robots and tactile (whisker) sensors has exhibited promising results in terms of both achieving tasks successfully and mirroring the corresponding biological processes. These methods have been effectively deployed in neuromorphic hardware on multiple occasions, showcasing the capacity to replicate the adaptability mechanisms observed in biological systems.

4.3 Side-loop BC

4.3.1 Basal ganglia-based controllers

Research on computational models for the BG in robotics primarily revolves around replicating the mechanisms and sub-regions responsible for action selection based on cortical signals (Gurney et al., 2001; Frank, 2011; Véronneau-Veilleux et al., 2021). These models aim to generate specific actions and dynamics while incorporating learning mechanisms through RL. In the context of comprehensive brain-motor system models intended for physical robots operating in a complex environment, BG models are essential: they provide the RL capability needed to learn and select optimal high-level actions in such an environment.

While the BG remain among the least understood brain areas concerning specific connections, interactions, and operational principles, there exists sufficient evidence to model certain functionalities. This evidence allows for testing the distinct roles of the BG sub-regions. The works presented below focus on modeling these roles and functions, employing simulated agents for action selection and robotic or human-like arms for motion-related tasks.

Prescott et al. (2006) integrated a BG model into a small mobile robot, enabling it to select actions amidst various sensory and motivational conditions. While the model effectively selects between competing actions in most cases, it encounters difficulty when faced with two highly probable actions, resulting in oscillation between two behaviors. Mannella and Baldassarre (2015) noted that previous assumptions on how action selection works in the BG are challenged by new perspectives, and proposed a computational model accounting for these. The model was tested on a simulated 3-DOF joint-actuated arm for target reaching tasks, and a 20-DOF hand to reach specific postures. It provided a successful implementation of new properties observed in the BG and not tested before, although they used some biologically implausible simplifications such as supervised learning in certain areas, pointing toward possible further work to bring it closer to biology.

In González-Redondo et al. (2023), the authors proposed a computational model designed to associate complex input patterns with rewarded actions, enabling informed decision-making. A spiking model of the striatum, a BG component, was implemented to perform RL tasks with a simulated agent. This study underscores the pivotal role that different connections and mechanisms within the BG play in facilitating effective action selection. Baladron et al. (2023) adopted a systems-level approach in their implementation, including a cortex-basal ganglia loop. Their system uses novelty-based Hebbian learning to update the interconnections, selecting actions that drive their robotic arm's end-effector to novel positions.

4.3.2 Cerebellum-based controllers

While the previous brain areas contribute significantly to biological motor control and have been modeled to different extents for robotics, the cerebellum distinguishes itself by directly aligning with two pivotal tasks in robotics control: online error correction and dynamics modeling (Albus, 1971; Itō, 1984; Wolpert and Kawato, 1998; Wolpert et al., 1998). The cerebellum is capable of short-term and long-term adaptation (Wulff et al., 2009; Wang et al., 2014). In the short-term adaptation, it swiftly rectifies inaccuracies arising from the motor cortex commands, and effectively rejects disturbances from the external environment. This adaptation facilitates rapid adjustments in motion. In contrast, long-term adaptation in the cerebellum learns detailed dynamic models over time, encompassing the body and environment. This allows for the prediction of sensory and motor outcomes based on the current state and undertaken actions. Consequently, it refines motion beyond the limitations of relying solely on signals from the motor cortex.

The cerebellum has also been suggested to assist in other motion control tasks or even more general learning tasks (Sendhilnathan et al., 2020). Among them, the one that has been most studied and tested in humanoid robots with head and eyes is the vestibulo-ocular reflex (VOR) (Troost, 1984), which stabilizes the gaze of the eyes while the head is in motion.

If successfully reproduced with computational models, the capabilities of the cerebellum may allow novel robotic systems to overcome the issues posed by their nonlinearities and many degrees of freedom, by adapting online to learn the system dynamics or any change in these, and by quickly correcting errors that arise during the motion task due to unexpected disturbances.

In Carrillo et al. (2008), the authors introduced a real-time spiking model of the cerebellum based on EDLUT (Ros et al., 2006), an SNN simulator developed by their research group. They controlled a 2-DOF simulated and physical robot arm, showing dynamic adaptation to different tasks. This constituted the first real-time application of SNNs for robot control. The same research group has developed further work on SNNs for robot control based on EDLUT. In Luque et al. (2011a), they presented a cerebellar network that, on top of the sensory inputs, receives additional information regarding the context of the task of a robotic arm. This allows for faster adaptation to new contexts and for robustness against misleading contextual information. This work was extended in Luque et al. (2011b) by combining a feedforward and a recurrent network topology, which shows robustness against noise. In Luque et al. (2014), they performed further experiments on a simulated robot to show incremental learning of different dynamics models with minimal mutual interference. Garrido et al. (2013) also presented a detailed biomimetic cerebellum architecture that controls a simulated robot arm, and highlighted its capacity for short-term error compensation and long-term adaptation when the model is changed. More recently, Abadia et al. (2021a) implemented a larger-scale SNN based on EDLUT, showing real-time continuous learning on a real compliant robot against unstructured interactions. They also show in Abadia et al. (2021b) that this cerebellar network can deal with non-deterministic delays in robotics applications. Finally, based on the same SNN simulator, Naveros et al. (2020) developed a real-time control loop to operate a real humanoid robot performing different VOR tasks.

In an alternate work mentioned in the high-level BC section, DeWolf et al. (2016) and DeWolf et al. (2023) proposed a method that models several brain motor areas using SNNs. However, their focus lies not in replicating the structural properties but rather capturing the functionality of these regions. Notably, their cerebellum model lacks detailed anatomical representations of the microcircuit; instead, it is represented by a generalized SNN. This cerebellar module is designed to learn the inverse dynamics model of the system, targeting the inertia and gravity components within the torque dynamics equation. Simultaneously, it provides corrective control signals to counteract inaccuracies arising from the primary motor cortex module's generated control actions.

Prior works relied on obtaining comprehensive online measurements of robot joint states, essential for the cerebellar model's computation of required torques through inverse dynamics. However, scenarios might arise where only partial joint state information, e.g., joint angles, is available online. This limitation compromises the accuracy of the dynamics model, as it depends on non-measurable states. To tackle this challenge, Zahra et al. (2021b) introduced an SNN-based differential map function akin to a Jacobian projection. This map relates task-space to joint-space, compensating for incomplete measurements. Their approach involved an optimization procedure to determine the network's hyperparameters, followed by offline training using data gathered from motor babbling. This differential map concept was further explored in Zahra et al. (2021a), where it was combined with a cerebellar network to control a real robot arm. This integration implemented a Smith predictor (Smith, 1957), a control structure adept at handling delays, taking inspiration from Tolu et al. (2020). In this structure, the cerebellum acted as a forward-dynamics model, offering task-space sensory adjustments to the differential map. Consequently, the map generated joint commands based on task-space information. Building upon this foundation, the work presented in Zahra et al. (2022a) refined the previous model, introducing a more detailed network and an optimization-driven method to fine-tune the entire network's hyperparameters. This enhanced model showcased its efficacy in controlling a robot arm amidst disturbances, demonstrating significant improvements over prior iterations.
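
For reference, the classical relation that such a differential map is compared to can be written as follows. This is the standard differential kinematics of a manipulator, not the SNN formulation of the cited works; the symbols (task-space position x, joint configuration q, Jacobian J and its pseudoinverse J^+) are notation assumed here for illustration. A learned differential map plays a role analogous to the second expression, producing joint-space increments from task-space increments without an analytical Jacobian.

```latex
\[
\dot{\mathbf{x}} = \mathbf{J}(\mathbf{q})\,\dot{\mathbf{q}},
\qquad
\Delta\mathbf{q} \approx \mathbf{J}^{+}(\mathbf{q})\,\Delta\mathbf{x}
\]
```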

Most of the previously presented works adopted a bio-plausible approach, closely emulating the cellular mechanisms found within the cerebellum. However, there is a distinct line of research that diverges from biological plausibility, employing computational models of the cerebellum to achieve remarkable results in robotics tasks. One such noteworthy contribution is the work by Tolu et al. (2012, 2013), where they implemented a modified adaptive filter model (Fujita, 1982; Dean et al., 2010) of the cerebellum. Their approach used the incremental learning algorithm Locally Weighted Projection Regression (LWPR) (Vijayakumar et al., 2005) for long-term dynamics model learning. LWPR is a regression technique that can model non-linear functions in high-dimensional spaces by combining locally linear models that are created and updated online, making it suitable for incremental learning. Additionally, Tolu et al. integrated an extra module for fast adaptation and disturbance rejection, leveraging LWPR's receptive fields (the membership function of each linear model) and updating based on feedback error. This methodology demonstrated successful control of a simulated multi-DOF robot arm executing cyclic trajectories despite the presence of external disturbances. Capolei et al. (2019) extended this work by employing a larger cerebellum model to tackle a more complex task: balancing a ball on a table using the arm of a simulated iCub robot. In Capolei et al. (2020), the same researchers introduced an augmented architecture featuring additional synapses and learning rules, aligning more closely with biological principles but still without spiking neuron models. While showcasing good tracking accuracy and robustness against perturbations and noise in controlling 3 DOFs of the simulated iCub robot arm, this model struggles to generalize to all tested scenarios. The authors argued that a more biologically plausible architecture, incorporating additional brain areas, could potentially enhance the results. In a similar vein, Tolu et al. (2020) presented a cerebellar-based Smith predictor. Their results highlighted anticipation and adaptation against dynamic changes, demonstrating a significant improvement in tracking accuracy on a real robot arm. The cerebellar model used by Tolu et al. (2012, 2013) was also tested on an underwater robot in simulation, showing the ability to learn a dynamics model with cross-coupling effects and disturbance rejection (Alepuz et al., 2022).
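
To make the role of the receptive fields concrete, the LWPR prediction can be summarized as a normalized, weighted combination of local linear models. The expressions below follow the general form of locally weighted regression with Gaussian receptive fields; the symbols (centers c_k, distance metrics D_k, local offsets b_{0,k} and slopes b_k, number of local models K) are notation assumed here, and the projection step that LWPR uses inside each local model is omitted for brevity.

```latex
\[
w_k(\mathbf{x}) = \exp\!\left(-\tfrac{1}{2}(\mathbf{x}-\mathbf{c}_k)^{\top}\mathbf{D}_k(\mathbf{x}-\mathbf{c}_k)\right),
\qquad
\hat{y}_k(\mathbf{x}) = b_{0,k} + \mathbf{b}_k^{\top}(\mathbf{x}-\mathbf{c}_k),
\qquad
\hat{y}(\mathbf{x}) = \frac{\sum_{k=1}^{K} w_k(\mathbf{x})\,\hat{y}_k(\mathbf{x})}{\sum_{k=1}^{K} w_k(\mathbf{x})}
\]
```

Because each weight w_k decays quickly away from its center, updating one local model has little effect on predictions elsewhere, which is what makes the scheme suitable for incremental learning.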

A functional model of the cerebellum based on echo state networks (ESN) was proposed by Kalidindi et al. (2019). This model learns to supplement an approximate inverse kinematics model, specifically in the context of a soft-robot arm engaged in trajectory tracking tasks. In Zhang et al. (2023), the authors use a functional model of the cerebellum to control a musculoskeletal robot. A recurrent neural network acts as the primary motor cortex, sending motor commands to the robot based on the targets to be reached. Then, the cerebellar model, which is composed of two networks, first predicts the outcome of these commands with an ESN, and then sends corrective signals with a second network that learns using bio-plausible learning rules. The motion performance greatly improves with the use of the cerebellar model.

An alternate approach to modeling the cerebellum involves implementing the adaptive filter (AF) hypothesis (Fujita, 1982; Dean et al., 2010). In this framework, the cerebellar microcircuit is replicated as a set of second-order low-pass filters with different time constants and multiplicative weights assigned at their outputs. These weights are adapted online from an error signal. The weighted sum of all the filters' outputs allows for approximating the system dynamics. Wilson et al. (2016) implemented an AF model of the cerebellum together with an inverse model of the plant represented by a brainstem model, which were used to control a nonlinear electroactive polymer actuator in a VOR application. In Wilson et al. (2021), the same authors used a similar model of the cerebellum applied to a range of tasks in robot adaptive control and sensorimotor processing. More recently, Wilson (2023) used an AF to control the force of a biomimetic muscle model, by converting the processed AF signals into spiking signals.
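
The AF structure described above can be sketched in a few lines. The following is a minimal sketch under stated assumptions, not the implementation of any cited work: each second-order low-pass filter is approximated by two cascaded first-order stages integrated with forward Euler, the weights follow a simple LMS-style rule (weight change proportional to error times filter output), and the class name, time constants, learning rate, and the toy "plant" used as a teaching signal are all hypothetical.

```python
import math


class AdaptiveFilter:
    """Bank of second-order low-pass filters with adaptable output weights."""

    def __init__(self, time_constants, dt, learning_rate=0.0, weights=None):
        self.taus = list(time_constants)
        self.dt = dt
        self.lr = learning_rate
        n = len(self.taus)
        self.stage1 = [0.0] * n  # first low-pass stage of each filter
        self.stage2 = [0.0] * n  # second stage, i.e. the filter outputs
        self.weights = list(weights) if weights is not None else [0.0] * n

    def step(self, u):
        """Advance every filter by one time step; return the weighted sum."""
        for i, tau in enumerate(self.taus):
            a = self.dt / tau
            self.stage1[i] += a * (u - self.stage1[i])
            self.stage2[i] += a * (self.stage1[i] - self.stage2[i])
        return sum(w * p for w, p in zip(self.weights, self.stage2))

    def adapt(self, error):
        """LMS-style update: each weight moves along error * filter output."""
        for i, p in enumerate(self.stage2):
            self.weights[i] += self.lr * error * p


def run_demo(steps=20000, dt=0.001):
    """Learn a toy target signal; return early and late mean squared error."""
    taus = [0.02, 0.05, 0.1, 0.2]
    # Hypothetical target dynamics, chosen so the filter bank can match them.
    plant = AdaptiveFilter(taus, dt, weights=[0.0, 1.5, 0.0, -0.5])
    model = AdaptiveFilter(taus, dt, learning_rate=0.01)
    sq_errors = []
    for k in range(steps):
        u = math.sin(2.0 * math.pi * 1.0 * k * dt)  # 1 Hz input signal
        error = plant.step(u) - model.step(u)
        model.adapt(error)
        sq_errors.append(error * error)
    window = steps // 10
    early = sum(sq_errors[:window]) / window
    late = sum(sq_errors[-window:]) / window
    return early, late


if __name__ == "__main__":
    print(run_demo())
```

In this toy setting the target is exactly representable by the filter bank, so the error shrinks over time; with a real plant the weighted sum would only approximate the dynamics, and the quality of the fit would depend on the chosen time constants.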

Recent evidence suggests a significant role for the cerebellum in supporting RL processes in the brain. Liu et al. (2020) introduced a cerebellar model that enables RL without relying on explicit teacher signals. This model succeeded in a target-reaching task, demonstrating its effectiveness on both a simulated human arm and a real robot arm.

5 Conclusion

The works presented in this review on brain-inspired biomimetic control reproduce models of the brain motor areas to different degrees of biological plausibility, conditioned by the current neuroscientific knowledge of these areas.

In some neural areas like the motor cortex and BG, the exact network structure remains too unclear for replication in a bio-plausible model. Consequently, functional models are utilized. Conversely, areas like the cerebellum and spinal cord (e.g., CPGs) benefit from a deeper understanding, enabling bio-plausible implementations. Functional models of the motor cortex and BG handle tasks involving high-level information processing, such as planning and generating motor commands based on objectives, action selection, and RL. However, their bio-plausible implementations are limited due to the lack of structural knowledge. Furthermore, only simple robotic tasks have been addressed, which are currently easily solved by conventional non-biomimetic methods, like trajectory planners or basic RL. For more advanced tasks (e.g., more complex environments, behaviors and goals), more advanced bio-plausible models of these areas will be necessary. In contrast, the cerebellum and spinal cord CPGs address fundamental low-level control issues that must be solved before integrating higher behavioral complexity. Even for basic tasks, accurate execution relies on a robust dynamics model (cerebellum) and adaptive locomotive patterns (CPGs). Hence, robotics currently focuses predominantly on these areas for biologically inspired computational models.

While some works attempt a comprehensive representation of the brain motor system, they often lack complete biological plausibility. Future endeavors should concentrate on modeling multiple areas with bio-plausible frameworks, establishing their interactions (e.g., cerebellum-BG, cerebellum-CPGs) and investigating brain regions that remain unexplored, like the brainstem. As robotics evolves to tackle more intricate tasks, holistic models encompassing the entire motor cortex, sub-structures, and their interactions will become imperative.

An important consideration in the development and validation of these biomimetic controllers is the use of real robots vs. simulations. While simulations offer a controlled and safe environment for initial testing and validation with fast iterations, they inevitably fail to capture all the complexities of the real world. On the other hand, physical robots face real-world dynamics with uncertainties and disturbances, providing the ultimate scenario for evaluating the robustness and adaptability of these models. However, working with physical systems introduces additional challenges, such as real-time constraints, sensor noise, and hardware limitations.

Indeed, many of the works presented in this review acknowledge the complex nature of real robotics tasks. Even when testing only in simulation, they introduce artificial noise and disturbances to show that their methods are robust and able to generalize to the real world. At the same time, the works that test on real robots demonstrate a higher maturity of their methods and readiness for use in real scenarios. As discussed, the cerebellum plays a big role in noise and disturbance rejection, making it a crucial component for the success of future biomimetic controllers. Ultimately, a combination of simulations and real-robot experiments is likely necessary, with simulations serving as an initial development and testing platform, and real-robot experiments providing the final validation and refinement of these models.

Author contributions

AM: Conceptualization, Investigation, Methodology, Visualization, Writing – original draft, Writing – review & editing. DP: Conceptualization, Supervision, Writing – review & editing. ST: Conceptualization, Methodology, Supervision, Writing – review & editing.

Funding

The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Abadia, I., Naveros, F., Garrido, J. A., Ros, E., and Luque, N. R. (2021a). On robot compliance: a cerebellar control approach. IEEE Trans. Cybern. 51, 2476–2489. doi: 10.1109/TCYB.2019.2945498

Abadia, I., Naveros, F., Ros, E., Carrillo, R. R., and Luque, N. R. (2021b). A cerebellar-based solution to the nondeterministic time delay problem in robotic control. Sci. Robot. 6:eabf2756. doi: 10.1126/scirobotics.abf2756

Akopyan, F., Sawada, J., Cassidy, A., Alvarez-Icaza, R., Arthur, J., Merolla, P., et al. (2015). Truenorth: design and tool flow of a 65 mw 1 million neuron programmable neurosynaptic chip. IEEE Trans. Comput.-Aided Des. Integr
