
A Sponsored Supplement to Science

Brain-inspired intelligent robotics: The intersection of robotics and neuroscience

Produced by the Science/AAAS Custom Publishing Office

Sponsored by

Image: jim/Adobe Stock

NOW ACCEPTING MANUSCRIPTS

Science Robotics is a unique journal created to help advance the research and development of robotics for all environments. Science Robotics will provide a much-needed central forum to share the latest technological discoveries and to discuss the field's critical issues.

Join in the excitement for the debut issue coming December 2016!

ScienceRobotics.org

Be Among the First to Publish in Science Robotics

TABLE OF CONTENTS

Brain-inspired intelligent robotics: The intersection of robotics and neuroscience

About the cover: A robot hand reaches out for a mechanized brain, floating in a sea of magnified neurons. As we better understand the workings of the brain, we are able to create machines that can better mimic its ability to sense, learn, and react, with the aim of creating a new generation of more intelligent robots. This supplement was produced by the Science/AAAS Custom Publishing Office and sponsored by the publishing house of the Chinese Academy of Sciences.

Editors: Sean Sanders, Ph.D.; Jackie Oberst, Ph.D.
Proofreader/Copyeditor: Bob French
Designer: Amy Hardcastle

Materials that appear in this supplement were not reviewed or assessed by the Science editorial staff. Articles can be cited using the following format: [AUTHOR NAME(S)] [CHAPTER TITLE] in Brain-inspired intelligent robotics: The intersection of robotics and neuroscience (Science/AAAS, Washington, DC, 2016), p. [xx-xx].

Yan Xiang, Ph.D.
Director, Global Collaboration and Publishing Services China (Asia), Custom Publishing
[email protected]
+86-186-0082-9345

© 2016 by The American Association for the Advancement of Science. All rights reserved. 16 December 2016

Introductions 2 Realizing intelligent robotics Jackie Oberst, Ph.D. Sean Sanders, Ph.D. Science/AAAS

3 Innovating at the intersection of neuroscience and robotics Hong Qiao, Ph.D. Professor with the 100 Talents Program of the Chinese Academy of Sciences Group Director of Robotic Theory of Application, Institute of Automation, Chinese Academy of Sciences (CASIA) Deputy Director, Research Centre for Brain-Inspired Intelligence, CASIA Core Expert, CAS Center for Excellence in Brain Science and Intelligence Technology (CEBSIT)

Articles 4 Creating more intelligent robots through brain-inspired computing Bo Zhang, Luping Shi, Sen Song

9 Deep learning: Mathematics and neuroscience Tomaso Poggio

12 Collective robots: Architecture, cognitive behavior model, and robot operating system Xiaodong Yi, Yanzhen Wang, Xuejun Yang et al.

16 Toward robust visual cognition through brain-inspired computing Pengju Ren, Badong Chen, Zejian Yuan et al.

20 Brain-like control and adaptation for intelligent robots Yunyi Jia, Jianguo Zhao, Mustaffa Alfatlawi et al.

25 Neurorobotics: A strategic pillar of the Human Brain Project Alois Knoll and Marc-Oliver Gewaltig

35 Biologically inspired models for visual cognition and motion control: An exploration of brain-inspired intelligent robotics Hong Qiao, Wei Wu, Peijie Yin

39 Anthropomorphic action in robotics Jean-Paul Laumond

42 Actor–critic reinforcement learning for autonomous control of unmanned ground vehicles Xin Xu, Chuanqiang Lian, Jian Wang et al.

47 Compliant robotic manipulation: A neurobiologic strategy Hong Qiao, Chao Ma, Rui Li

50 Declarative and procedural knowledge modeling methodology for brain cognitive function analysis Bin Hu, Yun Su, Philip Moore et al.

Realizing intelligent robotics

Often invoking the future, various cultures throughout history have been obsessed with the idea of robots. Leonardo da Vinci detailed a humanoid robot in one of his notebooks. In third-century BC China, a mechanical figure was presented to King Mu of the Zhou Dynasty, as described in the book Liezi. There have even been tea-serving traditional Japanese puppets—known as karakuri—made by Hisashige Tanaka, described as "Japan's Edison."

Robotics is not just fodder for fiction, despite the word being coined by Russian-born American sci-fi writer Isaac Asimov in his 1942 short story “Runaround.” The field has become multidisciplinary, borrowing from engineering, mathematics, computer science, and more recently, neuroscience. Researchers from these fields are trying to build better robots and also to better humanity through their use.

One prominent site of robotics research is the Center for Excellence in Brain Science and Intelligence Technology (CEBSIT) at the Chinese Academy of Sciences (CAS). The mission of CEBSIT is "to tackle fundamental problems in the areas of brain sciences and brain-inspired intelligence technologies." One way it seeks to accomplish this mission is by developing brain-inspired hardware, including intelligent devices, chips, robotic systems, and brain-inspired computing systems. Important work in this field is also being performed at the Institute of Automation at CAS, particularly in the Laboratory of Research and Application for Robotic Intelligence of "Hand-Eye-Brain" Interaction, which focuses on developing intelligent technologies, including so-called "brain-inspired intelligence."

In this supplement, researchers from across China and around the world discuss the latest challenges and advancements in their field. For many functions, such as grasping, vision, and memory, researchers are turning to neurobiology for inspiration. Building humanoid or anthropomorphic robots remains the ultimate goal, but since these robots' memory and learning algorithms are patterned after the human brain, there is still much to learn about neurobiology before that goal can be attained.

The European Union-funded Human Brain Project's Neurorobotics Platform, in which the Robotics Laboratory is a participant, aims to further understand the brain and translate this knowledge into common products such as robots. Two schools of thought have emerged in robotics: biologically inspired robots, which include a body, sensors, and actuators, and brain-inspired computing robots.

Applications of the latest models from these two schools include unmanned ground vehicles and "collective robots"—multiple autonomous robots that perform tasks collaboratively. When applications such as these start to become commonplace, the Age of Robots may soon be a reality.

The Greek philosopher Aristotle wrote, “If every tool, when ordered, or even of its own accord, could do the work that befits it … then there would be no need either of apprentices for the master workers or of slaves for the lords.” If toolmaking is what distinguishes humans from animals, then robots, even if fashioned in our likeness, could evolve us even further.

Jackie Oberst, Ph.D.
Sean Sanders, Ph.D.
Custom Publishing Office
Science/AAAS


Innovating at the intersection of neuroscience and robotics

We are pleased to introduce this special supplement, "Brain-inspired intelligent robotics: The intersection of robotics and neuroscience," which presents recent research advances in this interdisciplinary area and proposes future research directions.

Robots have found increasing applications in industry, service, and medicine, due in large part to advances achieved in robotics research over the past decades, such as the ability to accomplish complex manipulations that are essential for automated product assembly. Despite these developments, robotics still has many technical bottlenecks to overcome. Robots still lack truly flexible movement, have limited intellectual perception and control, and are not yet able to carry out natural interactions with humans. These deficits are especially critical in service robots, where facile human–robot interactions are essential. A critical concern of government, academia, and industry is how to advance R&D for the key technologies that can bring about the next generation of robots.

Developing robots with more flexible manipulation, improved learning ability, and increased intellectual perception will achieve the goal of making these machines more human-like. One possible pathway to success in building next-generation robots is through brain-inspired intelligent robotics, an interdisciplinary field that brings together researchers from robotics, neuroscience, informatics, and mechatronics, among other areas. Brain-inspired intelligent robotics aims to endow robots with human-like intelligence that can be either application-oriented or mechanism-oriented. Application-oriented robotics focuses on mimicking human functions by using new models or algorithms borrowed from information science. However, such robots are usually designed for specific tasks and their learning ability is poor compared with that of humans. Mechanism-oriented robotics attempts to improve robot performance by mimicking the structures, mechanisms, and underlying principles of human cognitive function and movement. It therefore requires the close collaboration of researchers from both neuroscience and robotics.

Several brain projects are underway in the United States, Europe, Japan, and other countries, serving to promote neuroscience research. The interaction between neuroscience and information science central to these projects serves to advance research in brain-inspired intelligence, generating breakthroughs in many related fields. Advances in neuroscience research help to elucidate the mechanisms and neuronal circuitry underlying different mental processes in the human brain, providing the basis for new models in robotics research. Similarly, robotics research can provide new ideas and technologies applicable to neuroscience.

In this supplement, we provide examples of cutting-edge achievements in brain-inspired robotics, including both software and hardware development. The authors offer current perspectives and future directions for researchers interested in this interdisciplinary area, covering topics including deep learning, brain-inspired computing models, anthropomorphic action, compliant manipulation, collective intelligence, and neurorobotics. Due to space limitations, we regret that some exciting achievements from various frontiers in brain-inspired robotics could not be included.

We hope this supplement draws worldwide attention to this important area and promotes broader communication and collaboration across robotics, neuroscience, and intelligence science.

Hong Qiao, Ph.D.
Professor with the 100 Talents Program of the Chinese Academy of Sciences
Group Director of Robotic Theory of Application, Institute of Automation, Chinese Academy of Sciences (CASIA)
Deputy Director, Research Centre for Brain-Inspired Intelligence, CASIA
Core Expert, CAS Center for Excellence in Brain Science and Intelligence Technology (CEBSIT)



Creating more intelligent robots through brain-inspired computing

Bo Zhang1,4*, Luping Shi2,4, and Sen Song3,4

The great success achieved in building the digital universe can be attributed to the elegant and simple von Neumann architecture, of which the central processing unit (CPU) and memory are two crucial components. The scaling up of CPU and memory, which both follow Moore's law, has been the main driving force for computers for over half a century. Yet in March 2016, the semiconductor industry announced that it was abandoning its pursuit of Moore's law because device scaling is expected to soon reach its physical limit (1). Therefore, improvements in computers in the post-Moore's law era must be based on radically new technologies. Brain-inspired computing (BIC) is one of the most promising of these technologies (1). By deriving inspiration from the architecture and working mechanisms of the brain, BIC systems (BICS) can potentially achieve low power consumption, high robustness, and self-adaptation, and at the same time handle multimodal data in complex environments, which will be conducive to the development of systems that function and learn autonomously. We use the term "BICS" here instead of "neuromorphic computing" because it is more inclusive and, in our opinion, future computer architecture will be a hybrid of neuromorphic and nonneuromorphic components. Here we review the why, what, and how of developing BICS.

The need for BICS

As summarized in Figure 1, certain capabilities of the computer far surpass those of the human brain, including computation speed and accuracy, and memory access speed, lifetime, capacity, and accuracy. Conversely, certain capabilities of the brain trump those of the computer, including adaptability, robustness, flexibility, and learning ability. Accordingly, computers and humans each have different advantages in their task performance. For example, a one-year-old child has a built-in ability to identify his or her parents in a viewpoint-invariant way in a complex environment, a task that would be challenging for a computer. Conversely, computers can accurately recall long lists of numbers and perform complex calculations, tasks that would be a marvel for humans. Thus, it stands to reason that a computer system incorporating the advantages of both computers and the brain would have capabilities far exceeding those of current computers, and perhaps the brain as well.

What are some of the problems with current von Neumann architecture that limit its ability in tasks at which humans excel? A critical issue is that in von Neumann architecture, the CPU and memory are separate. CPU speed has grown at a faster pace than memory speed, creating a so-called "memory wall effect" caused by this speed mismatch, which greatly reduces the computer's efficiency. BICS, however, can significantly enhance computing performance while greatly mitigating the influence of the memory wall effect and reducing energy consumption. They can also provide flexibility, robustness, low power, and real parallel computation capability, and facilitate the integration of new brain-like models and algorithms, making them highly suitable for the development of intelligent robots.

The current status of BICS

Recently, the development of BIC technologies has become the focus of intensive efforts, with many possible solutions proposed (2–35). One example is the TrueNorth chip and Compass software system, developed by Modha et al. (2–6), which provides a scalable, efficient, and flexible non–von Neumann architecture based on a neuromorphic system. It consists of 4,096 neurosynaptic cores interconnected to form an intrachip network, which uses silicon technology to integrate 1 million programmable neurons (that communicate using signal events in the form of "spikes," as occurs in biological neurons) and 256 million configurable synapses.

Another example is the SpiNNaker system, developed by Furber et al. (7, 8), which is built using standard von Neumann computing blocks and comprises a parallel 1,000-core computer interconnected with low-power Acorn RISC Machine (ARM; RISC, reduced instruction set computing) processor cores. It is suitable for modeling large-scale spiking neural networks with bioplausible real-time performance. The SpiNNaker platform is designed to deliver a broad capability that can support research exploring the information-processing principles at work in the brain.

1Department of Computer Science and Technology, Tsinghua University, Beijing, China
2Optical Memory National Engineering Research Center, Department of Precision Instruments, Tsinghua University, Beijing, China
3Department of Biomedical Engineering, Tsinghua University, Beijing, China
4Center for Brain-Inspired Computer Research, Tsinghua University, Beijing, China
*Corresponding Author: [email protected]

FIGURE 1. A comparison of the relative advantages of computers versus brains.

The NeuroGrid, developed by Boahen et al. (9, 10), is a neuromorphic system for simulating large-scale neural models in real time. It comprises mixed-signal analog/digital circuits capable of executing important synaptic and neuronal functions such as exponentiation, thresholding, integration, and temporal dynamics.

The BrainScaleS system, developed by Meier et al. (11, 12), is composed of wafer-scale arrays of multicore microprocessor systems that are used to build a custom mixed-signal analog/digital simulation engine, in which each 8-in. silicon wafer integrates some 50 × 10^6 plastic synapses and 200,000 biologically realistic neuronal circuits. These arrays can be programmed using computational neuroscience models or established software methods. They demonstrate good scalability and real-time operation.

In another approach, Indiveri et al. (13–15) developed a full-custom, mixed-signal, very-large-scale integration device with neuromorphic learning circuits for exploring the properties of computational neuroscience models and for building BICS. Other researchers have developed different types of neural network accelerators (16–21) and nanodevices such as memristor networks (which change their resistance based on past history and can naturally emulate synapse plasticity) with the aim of building neuromorphic networks for the development of neuromorphic computing circuits and chips (22–35).

Paths forward in developing BICS

Despite the many potential solutions discussed above, there is currently no consensus on the ideal technology (or combinations thereof) to develop the ideal BICS. Going forward, there are three main types of hardware approaches: (1) brain-like computing to emulate the primary functions of the brain; (2) BICS to build a new computing architecture with guidance from known basic principles of the brain; and (3) accelerators of existing neural networks. The key questions for developing BICS are how to build such systems without a full understanding of the biological mechanisms of the brain and, from a fundamental perspective, what new features can be integrated into BICS that will make them different from current computers. We address these questions below.

The underlying principles and computation mechanisms of computers and brains have both similarities and differences. As put forth by David Marr (36) and depicted in Figure 2, they can be understood on three different levels: design principles, algorithms, and hardware implementation. Marr's model inspired us to identify fundamental principles from the brain to guide the design of BICS and to narrow the gap between these two systems.

We posit that the two major differences between computers and the brain are those of the architecture and computation paradigms. In current computers based on von Neumann architecture, the CPU and memory unit are separate and data is centrally stored in the memory unit, as depicted in Figure 3. To deal with more complex problems, the primary strategy is to increase computational speed by increasing the clock rate, which governs the speed of information exchange between the CPU and memory unit so that pieces of data can be accessed sequentially in rapid succession. However, the brain uses a different mode of "computation" than computers; it appears to make full use of resources and spatial complexity by distributing information processing to a large number of neurons across many areas. In the neocortex, each neuron is connected to between 1,000 and 10,000 other neurons through synapses; in some regions of the brain, there are as many as 100,000 connections. This distribution of information-processing capacity might help to explain the brain's energy and computational efficiency.

FIGURE 2. Common features and principles in computers and brains. IC, integrated circuits; CMOS, complementary metal-oxide semiconductor.

FIGURE 3. A comparison of the computational characteristics of von Neumann architecture (temporal complexity) and the human brain (temporal, spatial, and spatiotemporal complexity).

An innovative yet practical way to develop BICS is to take advantage of computing paradigms inspired by the latest developments in neuroscience research. The full understanding of mechanisms underlying brain function remains a distant goal. However, many of the brain's fundamental computational principles, or "algorithms," have been elucidated or soon will be, which makes possible the transfer of ideas from ongoing research in neuroscience to BICS design. As we discuss below, we have identified fundamental principles that may prove useful for developing BICS. We propose to embed them in a BICS design to create an efficient programming environment that can accommodate rapidly evolving algorithms (37).

To build a BICS, the first step is to design a new architecture. We propose that as a general characteristic of neural computation, neural network systems exploit resources in the temporal, spatial, and spatiotemporal domains. Moreover, the brain exhibits power-law distributions, which are signatures of complexity in each of these domains (38, 39). Nonetheless, the great success of current computers is attributable in part to the von Neumann architecture, thus it should not be disregarded in future computer design. To continue to take advantage of this architecture, we have proposed a hybrid BIC architecture that accommodates and integrates the complexity of the temporal, spatial, and spatiotemporal domains into a single platform (Figure 4) (40). This architecture integrates a new device called a BIC unit (BICU), which handles mainly perception-related tasks using raw data. A BICU can be built based on any technology that provides spatial complexity as well as a combination of spatial and spatiotemporal complexity. Such technologies include neuromorphic, photonic, and quantum devices.
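To make this division of labor concrete, the following minimal Python sketch treats the hybrid architecture of Figure 4 as a simple dispatcher that routes raw, perception-related data to a BICU and structured, symbolic work to the conventional von Neumann (CPU plus memory) path. The class and function names (HybridBICS, bicu_perceive, cpu_reason) and the toy data are purely illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the hybrid BICS idea: perception on raw data -> BICU,
# symbolic reasoning -> conventional CPU path. All names and values are illustrative.
from typing import Any, Callable

class HybridBICS:
    def __init__(self, bicu: Callable[[Any], Any], cpu: Callable[[Any], Any]):
        self.bicu = bicu  # e.g., a neuromorphic/spiking accelerator working on raw data
        self.cpu = cpu    # e.g., conventional symbolic or rule-based processing

    def run(self, kind: str, payload: Any) -> Any:
        # Perception-related tasks go to the BICU; everything else stays on the CPU path.
        return self.bicu(payload) if kind == "perception" else self.cpu(payload)

# Toy stand-ins for the two processing paths.
def bicu_perceive(pixels):
    return "obstacle" if sum(pixels) > 2.0 else "clear"

def cpu_reason(fact):
    return {"obstacle": "stop", "clear": "go"}[fact]

system = HybridBICS(bicu_perceive, cpu_reason)
percept = system.run("perception", [0.9, 0.8, 0.7])  # raw sensor data -> BICU
print(system.run("symbolic", percept))               # structured fact -> CPU path
```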

The development of BICS presents challenges in basic theory as well as in the development of hardware systems, software environments, and system integration. To succeed in BICS development, research in the following four areas is needed:

(1) In terms of theory, understanding how to fuse von Neumann and neuromorphic architectures is essential, specifically, how to express, process, store, and access "knowledge" and data within BICS;

(2) Developing new neural network (NN) models for identifying fundamental brain principles suitable for guiding the design and building of models that accommodate functions such as top-down and recurrent connections and excitatory/inhibitory mechanisms, among others;

(3) Building basic network logic modules for BICU technology that will support the use of complexity in spatial and spatiotemporal domains for different NN and learning mechanisms; and

(4) Developing software for the control and management of hybrid structures that will efficiently allocate, interact with, and schedule knowledge and data.

Based on these considerations, we have developed a cross-paradigm neuromorphic chip called “Tianjic” (Figure 5). Tianjic can support analog neural networks (ANN) and spiking neural networks (SNN) with a high degree of scalability and flexibility, and can accommodate most current NN and learning methods. We have also developed accompanying software. Demonstrations have shown that when integrated into a self-driving bicycle, Tianjic performs multiple brain-like functions, including sensing, tracking, memory, adaptation, analysis, control, and decision-making (40). The bicycle was able to follow a running person and pass obstacles smoothly (Figure 5). The development of such chips paves the way for the practical construction of BICS, which will hopefully achieve a degree of high-level intelligence far beyond the current generation of computers.

Using BICS to develop intelligent robots

Today the field of robotics is one of the most dynamic areas of technological development (41–45). However, the lack of intelligence in current industrial and service robots severely limits their applications. Therefore, the most important issue in advancing this field further is endowing robots with cognitive abilities. To this end, we view the two most pressing imperatives as endowing robots with perceptive capabilities and drastically reducing their power consumption. If robots lack perceptive capability in a complex and dynamic environment, it will be difficult for them to support other cognitive functions, let alone to learn and function autonomously.

FIGURE 4. Proposed hybrid architecture of BICS. The red dotted line depicts components inherited from current computer architecture—I/O (input/output), a memory chip, and the central processing unit (CPU)—to which a brain-inspired computing unit (BICU) has been added.

Intelligent robots must be capable of carrying out the following functions:
(1) Modeling of complex environments from sensor data;
(2) Synergy and redundancy in control;
(3) Multisensory information processing and multimodal integration;
(4) Active, interactive, collaborative, and continuous learning; and
(5) Emotional understanding and the ability to interact with humans in a natural way.

BICS are highly suitable for the development of intelligent robots because they are better able to address the imperatives discussed above, namely power consumption, which is especially important for mobile robots, and the practical implementation of perceptive and cognitive capabilities, which currently often requires a large, nonmobile computer cluster.

BICS carry out perceptual tasks on raw data with a much higher degree of power efficiency than that of current computers. This reduced energy consumption is one of the most important driving forces behind the development of BICS, and can be attributed to the following: (1) Memory and computing are integrated, which avoids the huge energy consumption associated with data transfer; (2) sparse spatiotemporal and spike-based coding minimizes communication overhead; (3) computation is only performed when necessary due to an event-driven mode of operation; and (4) novel passive devices store synaptic weights and memory with almost no power consumption.
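As an illustration of point (3), the event-driven mode of operation, here is a minimal Python sketch of a leaky integrate-and-fire update driven purely by incoming spike events; the time constant, threshold, and event list are illustrative assumptions, not parameters of any particular chip. Work is done only when an event arrives, rather than on every clock tick.

```python
import math

def simulate(events, tau=20.0, threshold=1.0):
    """events: iterable of (time_ms, neuron_id, weight) presynaptic spike events."""
    potential = {}      # membrane potential per neuron
    last_time = {}      # time of the last event seen by each neuron
    output_spikes = []
    for t, nid, w in sorted(events):
        # Computation happens only here, when an event arrives: decay the potential
        # for the elapsed interval, add the synaptic weight, and check the threshold.
        v = potential.get(nid, 0.0)
        dt = t - last_time.get(nid, t)
        v = v * math.exp(-dt / tau) + w
        if v >= threshold:
            output_spikes.append((t, nid))
            v = 0.0     # reset after firing
        potential[nid] = v
        last_time[nid] = t
    return output_spikes

# Example: two weak inputs to neuron 0 (the second pushes it over threshold)
# and one strong input to neuron 1.
print(simulate([(1.0, 0, 0.6), (5.0, 0, 0.6), (40.0, 1, 1.2)]))
```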

For perceptive and cognitive applications in BICS, the von Neumann architecture is convenient for symbolic reasoning, whereas the BICU excels at perceptual recognition, intuitive judgment, and learning, using complex network representations that integrate information storage and computing capability. By combining the advantages of both von Neumann and neuromorphic architectures with low power consumption, BICS provide a highly suitable model for the development of intelligent robots (Figure 6).

Opportunity for development of BICS

At present there is no widely accepted paradigm for building BICS. However, as we discuss below, the development of BICS appears to be ripe for a breakthrough due to rapid advances in neuroscience, information technology, mathematics, and new devices, as well as great interest from the semiconductor industry.

Over the past decade, the field of neuroscience has accumulated vast amounts of data. With the development of the two-photon microscope and functional magnetic resonance imaging, it is now possible to make brain measurements at scales ranging from single neurons to large-scale brain networks. In addition, optogenetics can be used to precisely control brain activity at the level of single neurons. Thus the field is currently in a state of rapid development and approaching a threshold for achieving a much greater understanding of the mechanisms of brain function.

In terms of information technology, current supercomputing technologies can simulate a cortical column in exquisite detail, as well as large-scale, whole-brain models with simplified spiking neurons. In addition, the development of Internet technologies and sensor technologies has brought about a big-data revolution in recent years. The resulting massive data trove has generated an intense demand for efficient artificial intelligence methods, and has made it possible to train large-scale, brain-inspired models such as deep convolutional neural networks and recurrent neural networks.

In parallel, a wide range of mathematical branches have started to cross-pollinate, laying the foundation for a comprehensive and rigorous theoretical framework for BICS. One promising avenue of research is Bayesian probability theory, which has been shown to model aspects of human learning and decision-making. The field of complex systems, which encompasses graph theory–based analysis of network topology and theories of the dynamics of nonlinear differential equations, also promises to radically advance BICS.

In addition, many recently developed devices can serve to enhance BICS. For example, as mentioned earlier, Strukov et al. (22) developed a memristor device for constructing a new kind of neuromorphic network by emulating characteristics of synapse plasticity, short-term memory, and long-term memory. It has low power consumption and is compatible with silicon technology.

In the semiconductor industry, the maximum number of transistors integrated on a single chip has surpassed 10 billion, far higher than the number of neurons in the nervous systems of Drosophila and mice. Within the next few years, that number is expected to reach or exceed 100 billion, roughly the number of neurons in the human brain. It is predicted that when technology allows the semiconductor device fabrication node to shrink below 10 nm in size, which is projected to happen in 2017, neuromorphic devices will potentially reach the same level of computational efficiency as biological neurons (46).

Advances in artificial intelligence

Inspiration obtained from studying the brain can play an influential role in the development of artificial agents with human-like general intelligence. As an example, AlphaGo—software trained to play the deceptively complex ancient Chinese game of Go—represents a recent breakthrough in the development of artificial intelligence (AI) (47). In March 2016, it beat Lee Sedol, one of the world's top Go players, winning the series by four games to one. Previously, the Deep Blue program, another example of AI, used its vast computing power in chess to quickly search through large decision trees, a strategy quite different from that afforded by human intelligence. In contrast, the AlphaGo team used deep neural networks to mimic an aspect of what top Go players might label intuition or perception. In addition, AlphaGo has the ability to acquire skill by analyzing the games of top players through supervised learning, as well as by self-play through reinforcement-learning mechanisms, demonstrating human-like learning capabilities.

FIGURE 5. Photographs of (A) the Tianjic chip, and (B) a self-driving bike controlled by the chip.

Currently, AI largely entails carrying out specific tasks with discrete capabilities. This approach has yielded many interesting technologies. For example, deep learning has dramatically improved speech recognition technologies as well as visual object recognition and detection. In light of these advances, it is an opportune time to focus on artificial general intelligence (AGI), an alternate form of AI (47, 48). The field of AGI views "general intelligence" as a fundamentally distinct property from task- or problem-specific capabilities, and focuses on understanding the universal properties of intelligence. At the core of AGI is the hypothesis that synthetic intelligence with sufficiently broad (e.g., human-level) scope and strong generalization capability is qualitatively different from synthetic forms of intelligence with significantly narrower scope and weaker generalization capability (49). The flexibility, robustness, low power consumption, and real parallel computation capabilities provided by BICS would facilitate the integration of new brain-like models and algorithms suitable for developing AGI. The general capability of AGI will ultimately be of key importance in the evolution of truly intelligent robots.

Conclusions

To be considered genuinely intelligent, robots need to combine the superior capability of computers in performing searches (reasoning) and simulating human deliberative behaviors with the advanced capability of neural networks in perception and learning. This approach points to a promising path forward for building AGI systems. Such advances are highly compatible with the proposed hybrid BICS architecture, which can integrate tasks at which current computers excel—such as high-speed symbolic processing—with tasks at which the human brain excels—such as perception, adaptation, learning, cognition, and even innovation and creativity. Furthermore, it will be important to create rich but controlled environments for robots to interact with the world as testbeds to develop their perceptual, cognitive, and motor abilities. In summary, BICS promise to greatly aid in the development of AGI, which will ultimately drive robots towards human-like intelligence.

References
1. M. M. Waldrop, Nature 530, 144–147 (2016).
2. P. A. Merolla et al., Science 345, 668–673 (2014).
3. F. Akopyan et al., IEEE T. Comput. Aid. D. 34, 1537–1557 (2015).
4. D. S. Modha et al., Commun. ACM 54, 62–71 (2011).
5. A. Amir, Proceedings of the 2013 International Joint Conference on Neural Networks (IEEE, Dallas, TX, August 2013), pp. 1–6.
6. J. Hsu, IEEE Spectr. 51, 17–19 (2014).
7. S. B. Furber, F. Galluppi, S. Temple, Proc. IEEE 102, 652–665 (2014).
8. S. B. Furber, IEEE Spectr. 49, 46–49 (2012).
9. B. V. Benjamin et al., Proc. IEEE 102, 699–716 (2014).
10. P. Gao, B. V. Benjamin, K. Boahen, IEEE Trans. Circuits Syst. I Regul. Pap. 59, 2383–2394 (2012).
11. K. Meier, 2015 IEEE International Electron Devices Meeting (IEEE, Washington, DC, December 2015), pp. 4.6.1–4.6.4.
12. J. Schemmel, J. Fieres, K. Meier, Proceedings of the 2008 IEEE International Joint Conference on Neural Networks (IEEE, Hong Kong, June 2008), pp. 431–438.
13. G. Indiveri et al., 2015 IEEE International Electron Devices Meeting (IEEE, Washington, DC, December 2015), pp. 4.2.1–4.2.4.
14. G. Indiveri, S. C. Liu, Proc. IEEE 103, 1379–1397 (2015).
15. N. Qiao et al., Front. Neurosci. 9, article 141 (2015).
16. T. Chen et al., ACM SIGPLAN Notices (ACM, New York, NY, 2014), pp. 269–284.
17. D. F. Liu et al., Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems (ACM, Istanbul, Turkey, March 2015), pp. 369–381.
18. Z. Du et al., Proceedings of the 42nd Annual International Symposium on Computer Architecture (ACM, Portland, OR, June 2015), pp. 92–104.
19. D. Y. Kim, J. M. Kim, H. Jang, J. Jeong, J. W. Lee, IEEE Trans. Consum. Electr. 61, 555–563 (2015).
20. J. Tanabe et al., 2015 IEEE International Solid-State Circuits Conference (IEEE, San Francisco, CA, February 2015), pp. 1–3.
21. Y. H. Chen, N. Fong, B. Xu, C. Wang, 2016 IEEE International Solid-State Circuits Conference (IEEE, San Francisco, CA, January–February 2016), pp. 262–263.

FIGURE 6. The development trend of BICU, showing the relationship between energy/time and spatial complexity. The energy efficiency of BICU (left dashed line) is better than that of von Neumann architecture and can be expected to gradually approach that of the human brain (right dashed line).


22. D. B. Strukov, G. S. Snider, D. R. Stewart, R. S. Williams, Nature 453, 80–83 (2008).
23. J. J. Yang, D. B. Strukov, D. R. Stewart, Nat. Nanotechnol. 8, 13–24 (2013).
24. M. D. Pickett, G. Medeiros-Ribeiro, R. S. Williams, Nat. Mater. 12, 114–117 (2013).
25. L. Deng et al., Sci. Rep. 5, 10684 (2015).
26. Q. Xia et al., Nano Lett. 9, 3640–3645 (2009).
27. K.-H. Kim et al., Nano Lett. 12, 389–395 (2011).
28. S. Park et al., 2012 IEEE International Electron Devices Meeting (IEEE, San Francisco, CA, December 2012), pp. 10.2.1–10.2.4.
29. G. W. Burr et al., 2014 IEEE International Electron Devices Meeting (IEEE, San Francisco, CA, December 2014), pp. 29.5.1–29.5.4.
30. J. M. Cruz-Albrecht, T. Derosier, N. Srinivasa, Nanotechnology 24, 384011 (2013).
31. S. B. Eryilmaz et al., Front. Neurosci. 8, 205 (2014).
32. B. Li et al., International Symposium on Low Power Electronics and Design (IEEE, Beijing, China, September 2013), pp. 242–247.
33. T. Tang et al., 2015 Design, Automation and Test in Europe Conference and Exhibition (IEEE, Grenoble, France, March 2015), pp. 860–865.
34. R. C. Luo et al., Proc. IEEE 91, 371–382 (2003).
35. X. Lu et al., 2015 52nd ACM/EDAC/IEEE Design Automation Conference (IEEE, San Francisco, CA, June 2015), pp. 1–6.
36. G. Marcus, A. Marblestone, T. Dean, Science 346, 551–552 (2014).
37. J. Lisman, Neuron 86, 864–882 (2015).
38. D. S. Bassett, M. S. Gazzaniga, Trends Cogn. Sci. 15, 200–209 (2011).
39. E. Bullmore, O. Sporns, Nat. Rev. Neurosci. 10, 186–198 (2009).
40. L. P. Shi et al., 2015 IEEE International Electron Devices Meeting (IEEE, Washington, DC, December 2015), pp. 4.3.1–4.3.4.
41. E. Burdet, D. Franklin, T. E. Milner, Human Robotics: Neuromechanics and Motor Control (MIT Press, Cambridge, MA, 2013).
42. A. J. Ijspeert, A. Crespi, D. Ryczko, J.-M. Cabelguen, Science 315, 1416–1420 (2007).
43. X. Hu et al., PLOS ONE 9, e81813 (2014).
44. J. Hertzberg et al., Künstliche Intelligenz 28, 297–304 (2014).
45. N. Burgess, J. G. Donnett, J. O'Keefe, Connect. Sci. 10, 291–300 (1998).
46. J. Hasler, B. Marr, Front. Neurosci. 7, 118 (2013).
47. E. Gibney, Nature 531, 284–285 (2016).
48. B. Goertzel, Journal of Artificial General Intelligence 5, 1–48 (2014).
49. D. Kumaran, D. Hassabis, J. L. McClelland, Trends Cogn. Sci. 20, 512–534 (2016).

Acknowledgments
We are indebted to the following researchers for their helpful discussions: the developers of Tianjic, J. Pei, Y. H. Zhang, and D. Wang; and the developers of the self-driving bicycle, M. G. Zhao, Y. Wang, S. Wu, and N. Deng. This work was supported in part by the Study of Brain-Inspired Computing of Tsinghua University (20141080934), the Beijing Municipal Science and Technology Commission, and the National Natural Science Foundation of China (61327902).

Deep learning: Mathematics and neuroscience

Tomaso Poggio

Understanding the nature of intelligence is one of the greatest challenges in science and technology today. Making significant progress toward this goal will require the interaction of several disciplines including neuroscience and cognitive science, as well as computer science, robotics, and machine learning. In this paper, I will discuss the implications of recent empirical successes in many applications, such as image categorization, face identification, localization, action recognition, and depth estimation, as well as speech recognition, achieved through a machine learning technique called "deep learning," which is based on multilayer or hierarchical neural networks. Such neural networks have become a central tool in machine learning.

Science and engineering of intelligence

The original idea of hierarchical neural networks came from basic neuroscience studies of the visual cortex conducted by Hubel and Wiesel in the 1960s (1). Recordings they made from simple and complex cells in the primary visual cortex suggested a hierarchy of layers of simple, complex, and hypercomplex neurons. These findings were later taken to reflect a series of layers of units comprising increasingly high degrees of complex features and invariance. Fukushima developed the first quantitative model of a neural network for visual pattern recognition, called the "neocognitron" (2). Its architecture was identical to today's multilayer neural networks, comprising convolution, max pooling, and nonlinear units. However, the training of that model differed from today's, since it did not rely on the supervised stochastic gradient descent (SGD) technique introduced by Hinton (3), which is used in current deep learning networks. Several similar quantitative models followed, some geared toward optimizing performance in the recognition of images (4), and others toward the original goal of modeling the visual cortex, such as a physiologically faithful model called "HMAX" (5, 6).

Whereas the roots of deep learning architectures are derived from the science of the brain, advances in the performance of these architectures on a variety of vision, speech, and text tasks over the last five years can be regarded as a feat of engineering, combined with the cumulative effect of greatly increased computing power, the availability of manually labeled large data sets, and a small number of incremental technical improvements. It is telling that several of the algorithmic tricks touted as breakthroughs just a couple of years ago are now regarded as unnecessary. Some notable examples are "pretraining," "dropout," "max pooling," and "inception modules." As discussed below, other ideas, such as "data augmentation" [once called "virtual examples" (7)], "batch normalization," and "residual learning" (8), are more fundamental and likely more durable, though their exact form will likely change somewhat.

Center for Brains, Minds, and Machines, McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA
Corresponding Author: [email protected]

Deep learning: Scientific breakthroughs?

Deep learning appears to be more than a good engineering implementation of existing know-how: It may be the beginning of a breakthrough with the potential to open a new field of science. There are several reasons for this statement; below I discuss the two that are most noteworthy.

The first is related to neuroscience. Deep networks trained with ImageNet, a database of more than 1 million labeled images, seem to mimic not only the recognition performance but also some of the tuning properties of neurons in visual cortical areas of monkeys (9). However, there are major caveats to this interpretation. One is that the variance explained by the models is only around 60%. The comparison is made between actual neurons and optimal linear combinations of units in a layer of the model, thus neglecting nonuniformities in cortical areas (such as cortical layers, different types of neurons, and face patches). However, for the sake of the argument, let me take an optimistic view of these claims, which leads me to conclude that in order to explain how the visual cortex works, it is sufficient to spell out a computational task, such as a high-variation object categorization task, and then find parameter values by optimizing the recognition performance of the architecture on the millions of data points that are images and their labels. The constraints on the architecture are rather weak, consisting of the simple nonlinear units and generic shift-invariant connectivity used in both the neocognitron and HMAX networks. Since the constrained optimization problem described above yields properties similar to what neuroscience shows, we can conclude that an optimization problem with such weak constraints has a unique solution and, furthermore, that evolution has found the same solution. As mentioned in previous work (9), this seems to imply that the computational level of analysis of both Marr and myself (10) effectively determines the lower levels of the algorithms, and that the details of the implementation are irrelevant.

The second reason deep learning may portend a major breakthrough in science is related to the existing mathematical theory of machine learning and the need to extend it. In particular, it seems that trained multilayer networks (deep networks) are able to generalize to other tasks and other databases very different from the ones on which they were originally trained (3). This behavior is quite distinct from and superior to classical one-layer kernel machines (shallow networks), which typically suffer from overfitting issues. Deep convolutional networks trained to perform classification on ImageNet achieve good performance on different image datasets with little additional training. The same networks can be used to perform somewhat different tasks such as detection or segmentation, again with relatively minor training of an additional output layer. These characteristics are consistent with the intriguing ability of deep learning networks to avoid overfitting and to show a testing error similar to the training error, at least when trained with very large data sets. A new focus in the mathematics of learning and in the field of function approximation is clearly desirable, in order to extend these findings and realize the potential of deep learning.

Deep learning: Open scientific questions and some answers

The ability of deep learning networks to predict prop-erties of the visual cortex seems a major breakthrough, which prompts the obvious question: What if the new technology makes it possible to develop some of the basic building blocks for brain-like intelligence? Of course, training with millions of labeled examples is biologically implausible and is not how children learn. However, it may be possible to largely eliminate the need for labels, which requires that a learning organism should have the ability to group together (with the implicit label of “same identity”) images of the same object (or for classification, of the same object class). This ability of “implicit labeling” could then bypass the need for huge numbers of labeled data. The idea is that when explicit labels are not provided, contextual or other information may often allow implicit labeling. There are several plausible ways for biological organisms, particularly during development, to implicitly label images and other sensory stimuli such as sounds and words. For images, time continuity is a powerful and primitive cue for implicit labeling (11–13). As a back-of-the-envelope estimate, let us consider how many unlabeled images a baby robot could acquire during its first year of “life.” One saccade (i.e., an eye movement that changes the direction of gaze) per second, 8 hours per day for 365 days would provide the robot with over 10 million images. If only a small fraction of these images, say 10%, were implicitly labeled, even this smaller number could provide a sufficiently rich training set, as suggested by empiri-cal results on ImageNet. A single deep learning network architecture certainly cannot deal with the challenging cognitive tasks that humans perform constantly. However, it may be possible to “build a mind” with a set of different modules, several of which are deep learning networks.
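For concreteness, the back-of-the-envelope estimate above works out as follows (assuming exactly one saccade per second for eight uninterrupted hours each day):

1 image/s × 3,600 s/h × 8 h/day × 365 days = 10,512,000 images ≈ 1.05 × 10^7.

An implicitly labeled fraction of 10% would therefore still amount to roughly one million images, comparable in size to the labeled ImageNet set mentioned above.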

I now turn to the set of mathematical questions posed by the successful engineering of deep learning, to understand when and why deep networks are better than shallow ones. In answering this question, there are two further sets of questions about deep neural networks to consider. The first relates to the power of the architecture: Which classes of functions can it approximate well, and what is the role of convolution and pooling? The second set relates to learning the unknown coefficients from the data: Do multiple solutions exist? If so, how many? Why is the learning technique based on SGD so unreasonably efficient, at least in appearance? Are good local minima easier to find in deep rather than in shallow networks?

Answers to some of these questions are just beginning to emerge. There is now a rather complete theory of convolutional and pooling layers that extends the current neural network design to other classes of transformations beyond the translation group, including nongroup transformations, ensuring partial invariance under appropriate conditions. Some of the less obvious mathematical results are as follows (12, 14):

• Pooling—or locally averaging—a number of nonlinear units is sufficient for invariance and for maintaining uniqueness [see theorems in section 3 of (12)], provided there is no clutter within the pooling region. Thus, convolution and subsampling as performed in neural networks are sufficient for selective invariance, provided there are a sufficient number of units (a minimal numerical illustration follows this list). The form of the nonlinearity used in pooling is rather flexible: Smooth rectifiers and threshold functions are among the simplest choices.

• The size of the pooling region does not affect uniqueness, apart from larger regions being more susceptible to clutter. Reducing clutter effects on selective invariance by restricting the pooling region is probably a main role of "attention."

• Max pooling is a form of pooling that provides invariant but not strictly selective information. There is no need for max pooling instead of local averaging, apart from computational considerations.
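As a deliberately tiny illustration of the first point above, the following Python sketch rectifies the responses of shifted copies of a template and then average-pools them over the whole region; the filter, signal length, and shift are arbitrary choices of this sketch, not quantities from the cited theorems. The pooled response is identical for the original and the shifted pattern, as long as the pattern stays inside the pooling region.

```python
import numpy as np

def pooled_response(signal, template):
    # Nonlinear (rectified) responses of shifted copies of one template,
    # followed by a single average pool over the whole region.
    k = len(template)
    responses = [max(0.0, float(np.dot(template, signal[i:i + k])))
                 for i in range(len(signal) - k + 1)]
    return sum(responses) / len(responses)

template = np.array([1.0, -1.0, 1.0])
base = np.zeros(12); base[2:5] = [1.0, -1.0, 1.0]        # pattern at position 2
shifted = np.zeros(12); shifted[6:9] = [1.0, -1.0, 1.0]  # same pattern, shifted

print(pooled_response(base, template), pooled_response(shifted, template))  # equal values
```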

The above results related to pooling do not by themselves explain why and when multiple layers are superior to a single hidden layer. Using the language of machine learning and function approximation, the following additional statements can be made (15, 17):

• Both shallow and deep networks are universal. That is, they can approximate arbitrarily well—and in principle learn—any continuous function of d variables (on a compact domain).

• Then what is the difference between shallow and deep networks? We conjectured that a deep network matching the structure of a compositional function (Figure 1) should be "better" at approximating such functions than a generic shallow network. We define "compositional functions" as functions that are hierarchically local; they are compositions of functions with a small, fixed number of variables. To prove a formal result, one has to compare the complexity (measured in terms of number of units or parameters) of shallow and deep networks. Figure 1 (right) shows a "deep" hierarchical network with eight inputs and the architecture of a binary tree. The network architecture matches the structure of compositional functions of the form:

f(x1,…,x8) = g3(g2(g1(x1,x2), g1(x3,x4)), g2(g1(x5,x6), g1(x7,x8))).

• Each of the nodes in the deep network consists of a set of rectified linear units, so called because each unit implements a rectification of its scalar input (the output is zero if the input is negative, and equal to the input otherwise). Though not needed for the theorem, Figure 1 assumes shift invariance (equivalent to weight sharing and convolution). Thus, the nodes in each layer compute the same function (for instance, g1 in the first layer). With these definitions, the following theorem, stated informally [for a formal statement see (15)], answers the question of why and when deep networks are superior to shallow networks:

Both shallow and deep networks can approximate a compositional function of d variables. For the same level of accuracy, the number of required parameters in the shallow network is exponential in d, whereas it is linear in d for the deep network.

Notice that the binary tree of Figure 1 (right) is a good model of the architecture of state-of-the-art neural networks with their small kernel size and many layers (8). This figure is very similar to hierarchical models of the visual cortex, which show a doubling of receptive field sizes from one layer to the next.
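To make the compositional form above concrete, here is a minimal NumPy sketch of the eight-input binary-tree network, with each node g implemented as a small block of rectified linear units and weights shared within each layer; the hidden size and the random weights are purely illustrative assumptions, not values from the cited theorem.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu_node(W, b, a, c):
    """One constituent function g: a few ReLU units applied to a pair of inputs (scalars or vectors)."""
    z = np.concatenate([np.atleast_1d(a), np.atleast_1d(c)])
    return np.maximum(0.0, W @ z + b)

# Hypothetical shared parameters: each node maps its two inputs to k hidden units,
# and the root node maps to a single scalar output.
k = 4
W1, b1 = rng.standard_normal((k, 2)), rng.standard_normal(k)       # leaf layer: 2 scalars -> k
W2, b2 = rng.standard_normal((k, 2 * k)), rng.standard_normal(k)   # middle layer: 2k -> k
W3, b3 = rng.standard_normal((1, 2 * k)), rng.standard_normal(1)   # root: 2k -> 1

def f(x):
    """Implements f(x1,...,x8) = g3(g2(g1(x1,x2), g1(x3,x4)), g2(g1(x5,x6), g1(x7,x8)))."""
    g1 = [relu_node(W1, b1, x[i], x[i + 1]) for i in (0, 2, 4, 6)]
    g2 = [relu_node(W2, b2, g1[0], g1[1]), relu_node(W2, b2, g1[2], g1[3])]
    return relu_node(W3, b3, g2[0], g2[1])

print(f(np.arange(8.0)))
```

Note that the total parameter count of such a tree grows only linearly with the number of inputs d, in keeping with the theorem stated above, whereas a generic shallow approximant would require a number of units exponential in d for the same accuracy.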

The results outlined here are only the beginning of studies of multilayer function approximation techniques and related machine learning results. They are of broad interest because progress in machine learning needs a theory to build upon the empirical successes of deep learning achieved thus far.

FIGURE 1. On the left is a shallow universal network with one node containing n units or channels. On the right is a binary-tree hierarchical network. Some of the best deep learning networks are quite similar to the binary-tree architecture (8).


Importance of the science of intelligence

In this article, I have sketched some scientific questions in both neuroscience and mathematics that will need to be answered. In fact, there may be a deep connection between the two fields:

"A comparison with real brains offers another, and probably related, challenge to learning theory. The 'learning algorithms' we have described in this paper correspond to one-layer architectures. Are hierarchical architectures with more layers justifiable in terms of learning theory? It seems that the learning theory of the type we have outlined does not offer any general argument in favor of hierarchical learning machines for regression or classification. This is somewhat of a puzzle since the organization of the cortex—for instance, the visual cortex—is strongly hierarchical. At the same time, hierarchical learning systems show superior performance in several engineering applications." (16).

Beyond the case of deep learning, I believe that a concentrated effort in the study of the basic science of intelligence—not only the engineering of intelligence—should be a high priority for our society. We need to understand how our brain works and how it generates the mind; this understanding will advance basic curiosity-driven research and contribute greatly to human knowledge. It will also be critically important for future advances in engineering, and for helping us to build robots and artificial intelligences in an ethical manner.

References
1. D. H. Hubel, T. N. Wiesel, J. Physiol. 160, 106–154 (1962).
2. K. Fukushima, Biol. Cybern. 36, 193–202 (1980).
3. Y. LeCun, Y. Bengio, G. Hinton, Nature 521, 436–444 (2015).
4. Y. LeCun et al., Neural Comput. 1, 541–551 (1989).
5. M. Riesenhuber, T. Poggio, Nat. Neurosci. 3, 1199–1204 (2000).
6. T. Serre et al., Prog. Brain Res. 165, 33–56 (2007).
7. P. Niyogi, T. Poggio, F. Girosi, Proc. IEEE 86, 2196–2209 (1998).
8. K. He, X. Zhang, S. Ren, J. Sun, arXiv:1512.03385v1 [cs.CV] (2015).
9. D. L. K. Yamins, J. J. DiCarlo, Nat. Neurosci. 19, 356–365 (2016).
10. T. Poggio, Perception 41, 1017–1023 (2012).
11. F. Anselmi et al., Center for Brains, Minds, and Machines (CBMM) Memo No. 001, DSpace@MIT (2014); available at http://cbmm.mit.edu/sites/default/files/publications/1311.4158v5_opt.pdf.
12. F. Anselmi et al., Theor. Comput. Sci. 633, 112–121 (2015).
13. X. Wang, A. Gupta, Proceedings of the IEEE International Conference on Computer Vision (IEEE, Santiago, Chile, December 2015), pp. 2794–2802.
14. F. Anselmi, T. Poggio, Visual Cortex and Deep Networks: Learning Invariant Representations (MIT Press, Cambridge, MA, 2016).
15. H. Mhaskar, Q. Liao, T. Poggio, arXiv:1603.00988 [cs.LG] (2016).
16. T. Poggio, S. Smale, Not. Am. Math. Soc. 50, 537–544 (2003).
17. T. Poggio, "Deep Learning: Mathematics and Neuroscience" (2016); available at http://cbmm.mit.edu/publications/deep-learning-mathematics-and-neuroscience.

Acknowledgments
The author is supported by the Center for Brains, Minds and Machines (CBMM), funded by National Science Foundation Science and Technology Centers award CCF-1231216.

Collective robots: Architecture, cognitive behavior model, and robot operating system

Xiaodong Yi, Yanzhen Wang, Xuejun Yang*, Shaowu Yang, Bo Zhang, Zhiyuan Wang, Yun Zhou, Wei Yi, and Xuefeng Peng

State Key Laboratory of High Performance Computing, School of Computer Science, National University of Defense Technology, Changsha, China
*Corresponding Author: [email protected]

To carry out applications in the service of human society, such as exploration of unknown environments and disaster relief missions, robots must be able to autonomously adapt to dynamic and uncertain environments, and accomplish complex tasks with flexibility. As a means of achieving these goals, what we call "collective robots," a collection of multiple autonomous robots performing tasks collaboratively, are likely to become one of the most important types of robots in the future.

To improve their capacity to adapt to their environment and accomplish complex tasks, collective robots must be treated as a living system, an organism whose purpose is to survive in dynamic surroundings and successfully complete tasks. Therein lies the fundamental difference between collective robotics and swarm robotics: Swarm robotics projects, such as Kilobot (1) and swarm-bot (2), comprise multiple simple robots, while collective robots can comprise highly heterogeneous robots, each possessing strong sensing and movement capabilities.

Cooperation among collective robots is essentially different from that among conventional multirobot systems, as seen in hazardous environment exploration (3), unmanned aerial vehicle formation flight (4), and robotic soccer (5). When facing complex tasks in unknown environments, cooperation among collective robots relies on distributed mechanisms that merge autonomous actions from multiple robots, such that a higher level of collective intelligence emerges. By contrast, multirobot cooperation often relies on a certain element of centralized control.

Here, we review the design of the collective structure and cognitive behavior model of collective robots, with micROS providing an example of how to implement morphable, intelligent, and truly collective robotic operating systems.

The collective structure
Before they are capable of accomplishing complex tasks, collective robots must first be able to adapt to and operate within uncertain, dynamic, and unpredictable environments. According to Gödel’s proof of incompleteness, Heisenberg’s uncertainty principle, and the second law of thermodynamics, the character and nature of a system cannot be determined internally, and any efforts to do so will lead to disorder. Since only an open system is capable of fully adapting to changes, collective robots must have an open architecture and continually interact with open environments.

To guarantee their scalability, robustness, parallelism, and flexibility, collective robots should conform to a distributed organizational structure in which multiple nodes form an interconnected network and enable distributed control—this structure ensures that collective robots are decentralized, autonomous, and adaptive.

The collective structure exerts enormous effects on the behavior of collective robots. In general, the collective structure should be flexible and cellular: a networked structure formulated through design concepts, experiences, mission objectives, and oriented modes. This structure leads to abundant nondeterminacy that benefits the system’s revolutionary, innovative, and active evolution. Classical organizational structures include hierarchies, holarchies, coalitions, teams, congregations, federations, markets, matrices, and societies (6). Collective robots should be capable of autonomously formulating a variety of organizational structures according to different environments and tasks. Moreover, they should also be able to dynamically evolve into new structures when environments and tasks change. The collective structure should also be capable of integrating external, heterogeneous robots when needed.

Figure 1 illustrates collective robots providing services for human users in two domains of the urban environment. In the outdoor domain, unmanned aerial vehicles and unmanned ground vehicles improve mobility through cooperative sensing, planning, and data sharing. Indoors, the robots work collaboratively with computers and intelligent terminals to improve the quality of services.

FIGURE 1. Left: Collective robots interacting in an outdoor urban environment. Right: The collective structure of the robots in an indoor domain, where the nodes in the black, dashed circles represent the robots, and the cloud represents different environments interacting with robots and humans.

A model for cognitive behavior
Existing as a part of an organic whole, individual components of collective robots must cooperate with each other to achieve harmonious behavior. The collective should be capable of adapting to dynamic environments proactively, not just passively. To achieve dynamism, speed, harmony, and initiative, collective robots must possess the following key attributes of cognition and behavior:

(1) Observation: As an open system, collective robots must continuously acquire information from the surrounding environment through proactive observation rather than passive sensing. To improve the observational capability and effectiveness of the entire collective, diversity of observation within the system must be respected. Collective observation activities and perceived content should be circulated in the system to obtain a shared and enriched understanding for the collective.

(2) Orientation: As the key to cognition and behavior, orientation is defined as the paired acts of hypothesizing and testing, analyzing and synthesizing, and destroying and creating. It is a dynamic and continuous internal interaction. The results of orientation are represented as images, views, or impressions of the world shaped by genetic heritage, cultural tradition, previous experiences, or unfolding circumstances. Orientation brings insight, focus, and direction to the act of observing. As the pivotal point of control, orientation shapes the way in which collective robots interact with the environment.

(3) Interaction: Autonomous individuals and the interactions among them form the basis of collective intelligence. Harmony, focus, and direction in the collective’s operations are created by the bonds of implicit communication and trust that evolve as a consequence of the similar mental images or impressions each individual creates and commits to memory. These include common insights, orientation patterns, and previous experiences. In this way, it is not only the subjective initiative of each individual that can be mobilized, but also the collective intent.

(4) Decision and action: Observations that match up with certain mental schema call for certain decisions and actions. A “decision” occurs when each individual of the robot collective decides among action alternatives generated in the orientation phase. “Action” takes place when operations affecting the environment are carried out according to the decision. Previous research has shown that systems comprising more diverse and independent members make better choices (9).

(5) Feedback loop: Feedback is a key mechanism for an open system to implement self-regulation and self-organization. Besides enforcing the action-control component, feedback also allows for a greater number of choices, new strategies, and decision rules to apply. It leads to the questioning of assumptions and alternative ways of seeing a situation. When one becomes exposed to a new perception or experience that challenges existing schemas, a process of reorganization and adaptation occurs, leading to new schemas. Establishment of a rapid feedback loop is essential to promote the adaptability of a system exposed to dynamic and uncertain environments and tasks.

The OODA model for cognitive behavior
Based on the elements described above, the “observe-orient-decide-act” (OODA) model or loop can be constructed, as depicted in Figure 2. The remaining elements, namely the interaction and feedback loop, are implicitly included in this model. The OODA loop is a well-known decision cycle proposed by John Boyd for confrontational operations (7), which has been successfully extended to modeling complex behaviors in commercial and social domains.

The OODA loop exhibits the evolving, open, and purposely unbalanced process of self-organizing and emerging collective intelligence demonstrated in collective robots. The observe module of the OODA loop manages the sensory data and performs information filtering, compression, and fusion. The orient module then conducts analysis and synthesis of the environmental input. The decide module performs the hypothesizing, selection, and planning of the robots’ actions according to the required tasks and environmental situation. Finally, the act module carries out the control process according to the plan, as well as any feedback received, and performs continuous reassessments based on the most current information.
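As an illustration only, the sketch below wires the four OODA modules into a single control cycle for one robot node. The Node class, its method names, and the toy sensor dictionary are hypothetical and are not part of micROS or of the model in (7).

```python
# Minimal OODA-loop sketch for a single robot node (illustrative only;
# class, method names, and data structures are hypothetical, not the micROS API).
from dataclasses import dataclass, field


@dataclass
class Node:
    schema: dict = field(default_factory=dict)  # shared mental images/impressions

    def observe(self, raw_sensors: dict) -> dict:
        # Filter, compress, and fuse raw sensory data into a percept.
        return {k: v for k, v in raw_sensors.items() if v is not None}

    def orient(self, percept: dict) -> dict:
        # Analyze and synthesize the percept against stored schemas.
        self.schema.update(percept)
        return {"situation": dict(self.schema), "novelty": len(percept)}

    def decide(self, orientation: dict) -> str:
        # Choose among action alternatives generated during orientation.
        return "explore" if orientation["novelty"] > 0 else "hold_position"

    def act(self, decision: str) -> dict:
        # Execute the decision; the result feeds back into the next cycle.
        return {"executed": decision}


def ooda_cycle(node: Node, raw_sensors: dict) -> dict:
    percept = node.observe(raw_sensors)
    orientation = node.orient(percept)
    decision = node.decide(orientation)
    return node.act(decision)


if __name__ == "__main__":
    node = Node()
    print(ooda_cycle(node, {"lidar": 3.2, "camera": "frame_0", "gps": None}))
```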

FIGURE 2. John Boyd’s OODA loop [reproduced from (7)].

Collective intelligence
In an attempt to form collective intelligence by polymerizing the autonomous behaviors of collective robots through the OODA loop, the following four research topics have become the subject of intensive study (8):

(1) Autonomous observation and collective perception: Individual machines within collective robot systems may have different perception capabilities, and thus may assume different roles in perception. The perception–information fusion and the perception–behavior management processes will need to take these different roles into consideration.

(2) Autonomous orientation and collective cognition: Autonomous orientation of robots is achieved through machine learning technologies. A collective knowledge base is further utilized to obtain collective cognition.

(3) Autonomous decision and collective game behavior: Collective game behavior is based on cooperative/noncooperative game theory and is achieved by synthesizing autonomous planning of individual robots.

(4) Autonomous action and collective dynamics: Collective robots achieve cooperative and harmonized motion using collective dynamic models, such as the Boids, leader–follower, and graph-based models (10).

micROS
We have proposed a morphable, intelligent, and collective robot operating system (micROS) for the management, control, and development of collective robots (8). It provides a platform for standardization and modularity while achieving autonomous behavior and collective intelligence.

MicROS supports the construction of different collective structures. For example, Figure 3 illustrates an example of an organizational structure derived from Figure 1. MicROS also provides a wireless-based, scalable, and distributed self-organized networking platform that implements distributed, messaging-based communication mechanisms.

MicROS can be installed on each node of a collective robot to create a layered structure consisting of a core layer and an application programming interface (API) layer (Figure 4). The core layer is divided into resource-management and behavior-management sublayers. The former controls resource management in the physical, information, cognitive, and social domains. The latter, based on the OODA cognitive behavioral model, is composed of observation, orientation, decision, and action modules.

FIGURE 3. The distributed architecture of micROS for the city scenario example.

FIGURE 4. Layered structure of micROS nodes. APP, applications; API, application programming interface.

Conclusions and future directions
In this paper we illustrated the collective structure and the cognitive behavior model of collective robots. We also proposed the design of a morphable, intelligent, and collective robot operating system (micROS) for the management, control, and development of collective robots. In the future, we plan to focus our research on autonomous behaviors and collective intelligence, as well as devoting attention to the further development of micROS.

References
1. M. Rubenstein, A. Cornejo, R. Nagpal, Science 345, 795–799 (2014).
2. R. Gross, M. Bonani, F. Mondada, M. Dorigo, IEEE Trans. Robot. 22, 1115–1130 (2006).
3. M. Eich et al., J. Field Robot. (Special Issue on Space Robotics, Part 2) 31, 35–74 (2014).
4. B. D. O. Anderson, B. Fidan, C. Yu, D. Van der Walle, in Recent Advances in Learning and Control, Vol. 371, V. D. Blondel, S. P. Boyd, H. Kimura, Eds. (Springer-Verlag, London, 2008), pp. 15–33.
5. L. Mota, L. P. Reis, N. Lau, Mechatronics 21, 434–444 (2011).
6. B. Horling, V. Lesser, Knowl. Eng. Rev. 19, 281–316 (2005).
7. F. P. B. Osinga, Science, Strategy and War: The Strategic Theory of John Boyd (Routledge, Abingdon, UK, 2007).
8. X. Yang et al., Robotics Biomim. 3, article 21 (2016).
9. L. Fisher, The Perfect Swarm: The Science of Complexity in Everyday Life (Basic Books, New York, NY, 2011).
10. X. Wang, Z. Zeng, Y. Cong, Neurocomputing 199, 204–218 (2016).


Toward robust visual cognition through brain-inspired computing

Pengju Ren†, Badong Chen†, Zejian Yuan, and Nanning Zheng*

Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University, Xi’an, China
*Corresponding Author: [email protected]
†Contributed equally to this work.

Visual perception is one of the most important components of remote autonomous systems and intelligent robots and is one of the most active areas of artificial intelligence research. Achieving a better understanding of the brain’s visual perception mechanisms will be critical for developing robust visual cognitive computing, building compact and energy-efficient intelligent autonomous systems, and establishing new computing architectures for visual perception and understanding of complex scenes.

Inspired by the cognitive mechanisms of brain networks, associative memory, and human visual perception, here we review a computational framework of visual representation and “scene understanding” (1), and explore possible approaches to robust visual cognition.

Brain networks and cognitive plasticity
The brain is a complex network comprising interconnected tissues from different regions, with each region responsible for a different cognitive task. The functionality and tasks related to spontaneous brain activities are determined by a network topology that is hierarchical, multiscale, highly connected, and multicentric (2). In recent years, neuroscientists have gained insights into the cognitive mechanisms of the brain by investigating structural connections, as well as the aggregation and separation (convergence and divergence) of functional and effective connections (Figure 1). Structural connections comprise physiological connections within the cerebral cortex; the high level of redundancy determines the versatility and fault-tolerant characteristics of the brain. Functional connections are task-specific statistical correlations based on neurophysiological phenomena; their degree of reconstruction determines the brain’s cognitive plasticity (3). “Effective connections” describes the causal connections and mutual influences between different functional areas, which determine the efficiency and flexibility of cognitive ability.

Brain function is consistent with its structure. In contrast to knowledge representations based on symbols and probabilities, the brain achieves informational organization, inference, and reasoning through the dynamic evolution of sophisticated spatiotemporal networks. Such plastic, dynamic, and nonlinear brain networks are the structural basis of the robust cognitive function of the brain, and are also a starting point in the development of brain-inspired intelligence.

Memory and knowledge migration
Memory, an important function of the nervous system, is the neural activity in the brain responsible for storage and reproduction of acquired information and experience. The brain’s memory system can be divided into three categories: instantaneous, short-term, and long-term. Instantaneous memory is temporary storage in sensory organs. Short-term memory maintains information, performs fine processing, and rehearses new information from sensory buffers. Long-term memory comprises the accumulation of individual experience and the development of cognitive ability. The brain receives internal and external information collected from sensory organs and receptors, and then makes use of the knowledge memorized by the nervous system to complement, interpret, and judge the collected information. Due to unavoidable “noise” in the signals and incomplete observations, the brain has to correct and complement faulty or missing information based on its memories through different cognitive processing stages. In order to make adaptive behavior decisions, the nervous system must form an internal model that reflects the history of environmental changes and choose whether or not to modify or consolidate this model to form the long-term memory. Compared with the storage mechanisms of current computer systems, neural memory is characterized by the following features: a distributed structure for knowledge representation and storage, engram cell connectivity as a substrate for memory storage and retrieval, and tight coupling of memory and information processing (4, 5).

In the brain, memory is seamlessly integrated with the dynamic processes of its functional and effective connections throughout the entire cognitive information processing system, where relevant information is maintained, interpreted, judged, retrieved, revised, consolidated, and completed. “Knowledge migration” is the ability of humans to learn about new concepts by applying past knowledge to the new situation. More specifically, when the brain needs to learn a new model or respond to an unforeseen situation, it can mimic an existing, familiar memory, thereby executing different levels of knowledge transfer and achieving robust problem solving ability.

Visual perception and multiscale spatiotemporal scene learning

Since about 80% of external information perceived by the human brain is obtained through vision, visual perception mechanisms are of great interest in the study of brain cognition. Different properties of visual information are transferred and processed by various interacting pathways that make up the brain’s functional connectivity. There are three major properties of visual cognition that interest us here. The first is related to studies of visual cognition showing that neural coding mechanisms cannot be detected from a single cell. Moreover, single neurons are incapable of representing knowledge and associated memory independently. Instead, in visual cognition, mental concepts are represented by the joint activation of groups or assemblies of cells, with each cell participating in diverse assemblies at different times. This property greatly reduces the required scale of neuronal networks and improves the flexibility of neuronal representation. In neurobiology, this well-known concept is referred to as “population or assembly coding.”

A second characteristic of visual cognition is its dynamic and selective nature, whereby it can selectively focus on a discrete aspect of information while ignoring other perceivable information. The third characteristic is the adaptability of visual cognition, whereby it can make functional changes based on interactions with the outside world.

In addition, visual cognition does not function independently of the environment, but interacts with it as part of a more complex cognitive behavior. In primary vision, while local perception components might compete (e.g., focusing on different visual items of interest), on a larger, more global scale, they cooperate to bring about acute visual perception. This feature conforms to the “small world” properties of brain networks, and shows consistency between structure and function. The local–global interaction behavior in visual cognition necessitates that multiscale spatiotemporal features are integrated into visual information processing. Recently, these multiscale spatiotemporal features have been successfully applied to visual salience detection and collaborative tracking (6, 7). In these studies, a set of multiscale features was used to describe a salient object locally, regionally, and globally (Figure 2A), while different visual features were learned from sample sets with different timescales to track objects in challenging scenarios (Figure 2B).
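The multiscale methods in (6, 7) are considerably richer; purely as an illustration of combining local, regional, and global cues, the sketch below builds a crude salience map from center–surround differences of Gaussian blurs at several scales (an Itti–Koch-style simplification; the function name and scale values are arbitrary).

```python
# Multiscale center-surround salience sketch (an illustrative simplification,
# not the models of refs. 6 or 7). Requires numpy and scipy.
import numpy as np
from scipy.ndimage import gaussian_filter


def salience_map(gray: np.ndarray, scales=(1, 4, 16)) -> np.ndarray:
    """Combine center-surround differences across several spatial scales."""
    gray = gray.astype(float)
    total = np.zeros_like(gray)
    for center, surround in zip(scales[:-1], scales[1:]):
        # Fine (center) minus coarse (surround) blur highlights contrast at that
        # scale; summing over scales mixes local, regional, and global cues.
        total += np.abs(gaussian_filter(gray, center) - gaussian_filter(gray, surround))
    total -= total.min()
    return total / (total.max() + 1e-12)  # normalize to [0, 1]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((128, 128))
    image[40:60, 40:60] += 2.0  # a bright "salient" patch
    print(salience_map(image).argmax())
```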

Visual cognition is not merely a passive response to the environment, but an active behavior. Biological visual channels use a “bottom-up” process to form an initial recognition result for visual objects, and a reverse “top-down” process to control eye movement and attention to complete “forecast-verification” cognitive processes. Humans have the ability to use memory information to search for a specific target in complex environments and to process target information. Selective attention can be divided into two levels, (1) primary attention systems and (2) advanced attention systems, distinguished by whether attention is independent of semantics or content.

Framework and components of brain-inspired visual cognitive computing

A robust visual cognitive computing system should embrace the latest developments in neuroscience and the cognitive and information sciences (8). A framework for visual cognitive computing is presented in Figure 3 and is based on the following fundamental principles:

(1) Computation must be performed close to the sensors to enable configurable and programmable image data delivery that supports attention-driven information acquisition;

(2) A scalable and flexible network will mimic the structural connections of the human brain, using either static or dynamic task mapping and scheduling strategies, and associated routing schemes. These strategies will also mimic the brain’s functional connections, using adaptive routing path selection, dynamic link-bandwidth assignment, and priority-based resource allocation. This configuration allows functional and effective connections to be adjusted and modified according to different requirements of applications and tasks;

(3) Content-based data retrieval schemes are used to manage and organize distributed memory units and to record the routes and orders of the intermediate nodes during neural activity transmission;

(4) Convergent–divergent connections in the functional and effective networks will accomplish population coding and associative memory; and

(5) Visual cognition tasks can be performed by designing a multiscale, spatiotemporal dynamic relational network.

FIGURE 1. Brain network connections. Left, structural connections; center, functional connections; right, effective connections.


The proposed framework contains the following major components:

(1) Event-/attention-driven information acquisition: Accomplishes the “bottom-up” process of passive information acquisition, such as changes of optical flow caused by motion, as well as the “top-down” process of active information retrieval, such as concept- or empirical knowledge-based object location (6, 7).

(2) Spatiotemporal coding: Allows rapid functional connection and reorganization to achieve contextual (or task-related) representation, by applying temporal- or rate-coding for spiking signals. This enables visual information representation and management and uses spatiotemporal nonlinear mapping to achieve spiking activity propagation, transformation, and migration (9).

(3) Networked dynamic information processing: Various features of visual objects, such as shape, color, motion, and stereo view, are transmitted and processed in parallel and at multiple scales using hierarchical and scalable network structures to emulate the structural, functional, and effective connections of the human brain. Resource allocation is carried out with specific learning (or tuning) rules that mimic the synaptic plasticity of biological neural systems. Resilience against unanticipated events and uninformative background input, as well as recovery from failures, are required throughout the entire system design (10).

FIGURE 2. (A) Multiscale salience detection in image sequence. Modified from (6). (B) Multi-timescale collaborative tracking (7). CRF, conditional random field; SL, long-term target samples; SM, medium-term target samples; SS, short-term target samples; VL, long-term variables; VM, medium-term variables; VS, short-term variables; t, time.



(4) Distributed associative memory: A distributed context-based memory management scheme that performs population coding for visual concept representations and associative retrieval of memories. This component autonomously facilitates information maintenance, interpretation, judgment, correction, reinforcement, and integration, using associative memory that is seamlessly integrated with the processing units.

(5) Constraints and guidance of conditions: This component implements an intelligent control unit during cognition processing to complement information that is lost by the projection of a 3D scene onto a 2D image sensor. It also reduces uncertainty during information integration across multiple hierarchies and modules to generate a reasonable visual representation, which is accomplished by taking full advantage of conditions, prior knowledge, and hypotheses from the real world.

In order to continually improve and comprehensively verify this framework, it will be necessary to deploy visual cognitive computing in a complex, real-world environment. A self-driving, unmanned vehicle is a typical scenario that requires simultaneous execution of various complicated visual cognition tasks in real time. These include unstructured lane detection; pedestrian, vehicle, and traffic sign identification; and detection and tracking of multiple obstacles. Self-driving vehicles therefore represent an ideal platform for the development and evaluation of visual cognition systems (11).

FIGURE 3. Diagram of proposed brain-inspired visual cognitive computing. A modularized, flexible, scalable, and power-efficient computing architecture is achieved using the proposed model for the high-level synthesis of the computational requirement of desired functionalities, and the essential communication between these functionalities in order to carry out different tasks, together with a deeper understanding of the effective connections being emulated. Other critical features include systematically mapping logical neural networks to the physical neurosynaptic cores, logically scheduling execution orders, adaptively applying intra- and interchip routing algorithms and flow control schemes, and deliberatively designing a reconfigurable microarchitecture neurosynaptic core. These characteristics are crucial to fulfilling the functionality of the system. EPSP, excitatory postsynaptic potential; GALS, globally asynchronous locally synchronous; SRAM, static random-access memory.

Conclusions and perspectives
Visual cognitive computing is playing an increasingly important role in the development of intelligent autonomous systems and human-level robots. Developing a robust computing architecture inspired by brain networks, associative memory, and human visual perception is therefore a crucial area of brain-inspired robotics. We have proposed a framework and components for such a system on the basis of previous research on multiscale spatiotemporal scene perception and understanding. Going forward, we will begin to investigate large-scale, compact, and reconfigurable visual cognitive computing architectures and their implementation.

References
1. P. Wei, Y. Zhao, N. Zheng, S.-C. Zhu, IEEE Trans. Pattern Anal. Mach. Intell. (2016), doi: 10.1109/TPAMI.2016.2574712.
2. H. J. Park, K. Friston, Science 342, 1238411-1–1238411-8 (2013).
3. M. W. Cole et al., Nat. Neurosci. 16, 1348–1355 (2013).
4. Y. Li, N. Zheng, Prog. Nat. Sci. 8, 228–236 (1998).
5. R. Semon, The Mneme (George Allen & Unwin, London, 1921).
6. T. Liu et al., IEEE Trans. Pattern Anal. Mach. Intell. 33, 353–367 (2011).
7. D. Chen, Z. Yuan, G. Hua, J. Wang, N. Zheng, IEEE Trans. Pattern Anal. Mach. Intell. (2016), doi: 10.1109/TPAMI.2016.2539956.
8. P. A. Merolla et al., Science 345, 668–673 (2014).
9. L. Li et al., IEEE Trans. Neural Syst. Rehabil. Eng. 21, 532–543 (2013).
10. P. Ren et al., IEEE Trans. Comput. 65, 353–366 (2016).
11. N. Zheng et al., IEEE Intell. Syst. 19, 8–11 (2004).

Acknowledgments
This work was supported by the National Natural Science Foundation (61231018) and the National Basic Research Program (973 Program) (2015CB351703) of China.



Brain-like control and adaptation for intelligent robots

Yunyi Jia1, Jianguo Zhao2, Mustaffa Alfatlawi3, and Ning Xi4*

1Department of Automotive Engineering, Clemson University, Greenville, South Carolina, USA
2Department of Mechanical Engineering, Colorado State University, Fort Collins, Colorado, USA
3Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan, USA
4Emerging Technologies Institute and Department of Industrial and Manufacturing Systems Engineering, University of Hong Kong, Hong Kong
*Corresponding Author: [email protected]

Traditional industrial robots have historically been caged in structured environments where they accomplish simple and repetitive tasks. In contrast, the current generation of robots is required to be more intelligent so as to handle more complicated and dexterous tasks, and to work with humans in unstructured environments. These requirements bring new challenges for robots, such as the need for a higher degree of control, sensing, and decision-making. In this review, we discuss brain-like robot control and adaptation approaches for moving toward truly intelligent robotics.

Intelligent robots require brain-like perception and control. One particular aspect of these features is the ability to employ vision as feedback to control a robot’s motion. Traditional vision-based control methods need to extract and track geometric features in an image (1), which is not a brain-like function. In fact, insects or animals directly employ image-intensity information in controlling their motion to perform various tasks such as docking, landing, or obstacle avoidance (2).

To achieve brain-like control, we propose a direct vision-based control approach that leverages all the intensity, or luminance, values in an image to form an image set. By considering an image as a set, the approach formulates the vision-based control problem in the space of sets (3). Then, we can obtain a controller to steer the current image set to the desired image set. This derivation is performed in the space of sets, where the linear structure of the vector space no longer holds. We therefore refer to this approach as “non-vector space control.”

Recently, we have validated the non-vector approach using images obtained by atomic force microscopy (4, 5). We have also shown that the approach works even when only partial image information is known (6, 7), which can significantly increase the computation speed. With such a controller, a tailed robot is used to direct its midair orientation once it falls from a certain height (8). Here, we will briefly illustrate an alternate function: how the non-vector method could be leveraged to control the motion of a robotic manipulator (9).



Since intelligent robots are also required to work with humans, it is crucially important to consider the safety and efficiency of human–robot interaction. When humans work with each other, they can adapt to their coworkers to ensure cooperative performance. To imitate this function, we propose a brain-like robot adaptation approach and apply it to a specific human–robot interaction field: telerobotic operations.

Previous studies in telerobotic operations have focused mainly on two issues—stability and telepresence. The first issue concerns the stability of telerobotic systems in the presence of various system constraints such as communication delays, packet losses, and unstructured environments (10, 11). The second issue examines how teleoperators control robots in order to visually perceive the remote environment in an efficient way while presenting a sympathetic visage (12, 13). Most of these studies were based on the assumption that the human teleoperator is always fully able to control operations. This assumption, however, does not always hold true. For instance, the teleoperator may be suffering from drowsiness, deprivation, or even inebriation. Under such conditions, teleoperation safety and effectiveness may be seriously affected due to the operator’s decreased capacity. However, previous teleoperation studies have rarely considered these possibilities.

To enhance the performance of telerobotic operations, we first employed brain-like neural networks to identify the operational status of the teleoperator based on their online brain electroencephalogram (EEG) signals. We then designed a brain-like robot adaptation to adjust the robot behaviors in response to these signals. We proposed the “quality of teleoperator” (QoT) concept to represent the teleoperator’s operational status. Based on QoT, a robot adaptation mechanism can be designed to adjust the robot’s velocity and responsivity. Our experimental results, discussed below, have demonstrated that the efficiency and safety of telerobotic operations can be improved through such brain-like robot adaptation approaches.

Brain-like robot control for intelligent robotics
In our model, the non-vector space control approach considers the state of the system as a set instead of a vector. A system in the non-vector space can be described as φ(x(t), u(t)) ∈ Ṡ(t), where Ṡ(t), which is different from traditional derivatives with respect to time, is the set of all bounded Lipschitz functions satisfying particular conditions (9). u(t) = γ(S(t)) is the feedback mapping from the current feedback set S(t). Under this general framework, the stabilization problem in the non-vector space can be stated as follows: Given a controlled dynamical system φ(x(t), u(t)) ∈ Ṡ(t), a constant desired set Ŝ, and an initial set S0 in the vicinity of Ŝ, a feedback controller u(t) = γ(S(t)) should be designed such that the feedback set S(t) approaches Ŝ asymptotically. To address this problem, we have devised the following theorem: For the system f(x)u ∈ Ṡ(t) with f(x) ∈ R^(m×n), x ∈ R^m, u ∈ R^n, and S ⊂ R^m, the following controller can locally asymptotically stabilize it at Ŝ:

where dŜ(x) denotes the projection of a point x onto the set Ŝ. The detailed form of the controller, including D(S) ∈ R^n, can be found in (9).

The vision-based control problem can be modeled by the system in the above theorem, and the controller can be readily applied. In fact, for servoing with greyscale images, each pixel can be represented by a 3D vector x = [x1, x2, x3]^T, where x1 and x2 are the pixel indices and x3 is the pixel intensity. For a general visual servoing problem, the control input is the camera’s spatial velocity. Therefore, the control input u(t) has three translational components and three rotational components, which can be represented by the vector u(t) = [vx, vy, vz, ωx, ωy, ωz]^T.

The system in the theorem is determined by φ(x), which is further determined by the relationship between u(t) and x(t). The perspective-projection sensing model in computer vision can be used to derive such a relationship. Under constant lighting conditions, x3 will be constant for each pixel; therefore, ẋ3 = 0. With a unit camera focal length, a 3D point with coordinates P = [px, py, pz]^T in the camera frame will be projected to the image plane with coordinates x1 = px/pz, x2 = py/pz. Based on these equations, the relation between u(t) and x(t) can be obtained as ẋ(t) = L(x(t))u(t), where L(x(t)) relates the camera’s spatial velocity to the motion of each pixel.
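For concreteness, the block below sketches what L(x) looks like for a single pixel under the stated assumptions (unit focal length, depth pz, constant intensity). It is the classical point-feature interaction matrix from visual servoing with a zero third row, not a reproduction of the exact matrix given in (9).

```latex
% Sketch under the stated assumptions (unit focal length, depth p_z, constant
% pixel intensity x_3): the first two rows are the classical point-feature
% interaction matrix; the third row is zero because \dot{x}_3 = 0.
L(x) =
\begin{bmatrix}
 -1/p_z & 0      & x_1/p_z & x_1 x_2   & -(1 + x_1^2) & x_2 \\
 0      & -1/p_z & x_2/p_z & 1 + x_2^2 & -x_1 x_2     & -x_1 \\
 0      & 0      & 0       & 0         & 0            & 0
\end{bmatrix}
```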

Note that the first two rows are the same as the interaction matrix in visual servoing. Since ẋ(t) = L(x(t))u(t) has the same form as the system in the theorem, the controller can be readily applied. To simplify the problem, however, we only allow the camera attached to the end effector to have planar motion (i.e., two translational motions plus a rotation about its optical axis).

FIGURE 1. Experimental setup for non-vector based control.


The non-vector space control method is implemented on a redundant seven-degrees-of-freedom manipulator (e.g., LAW3, SCHUNK), which has seven revolute joints. The experimental setup is shown in Figure 1, where an ordinary complementary metal-oxide semiconductor (CMOS) camera is rigidly attached to the end effector. As in our preliminary experiment, a planar object with a rectangle, a circle, and a triangle printed on white paper is placed on the ground. First, the manipulator is moved to a desired position, where a desired image is acquired. The desired position is obtained using the manipulator’s forward kinematics. Then the manipulator is moved to some initial position, where the initial image is recorded. The manipulator will then start moving according to the non-vector space visual servoing (NSVS). The experimental results for this strategy are shown in Figure 2, where the initial and desired images are shown in panels A and B, respectively. The task space errors for two translations and one rotation are shown in panel C; the rotation error is plotted in degrees. From the plot, there are some overshoots for the system, but the controller appears to be able to stabilize the system at the desired position.

FIGURE 2. Experimental results of non-vector based control. (A) Initial image. (B) Desired image. (C) Error in the task space.

Brain-like robot adaptation for intelligent robotics

Brain-like learning of QoT
QoT reflects the operational status of the teleoperator. It is impossible to use only one state of the teleoperator to capture this quality. Thus we have defined the concept we call the “QoT indicator.” Each QoT indicator is designed to represent a specific state of the teleoperator. These indicators may measure the internal mental states of the teleoperator, such as fatigue and excitement, as well as external physiological states such as blood pressure and heart rate. In our previous work (14), we proposed using the mental state of the teleoperator to identify QoT. Five mental QoT indicators were designed: long-term excitement, short-term excitement, boredom, meditation, and frustration. In our more recent research (15), we proposed the pleasure, arousal, and dominance (PAD) model to construct a more comprehensive representation of the internal mental states.

Since the relationship between QoT and the QoT indicators involves a complex web of in-depth knowledge in various fields such as psychology and physiology, it is difficult or impossible to use an analytic model to represent this relationship. Fortunately, the brain-like neural network provides an alternative way to approximate the relationship.

In this case, there are three main steps in the identification model: EEG signal acquisition, QoT indicator generation, and QoT generation. EEG signal acquisition is realized by an Emotiv EPOC brain headset that has 14 saline sensors to collect EEG signals from the channels AF3, F7, F3, FC5, T7, P7, O1, O2, P8, T8, FC6, F4, F8, and AF4. QoT indicator generation is realized using an affectivity software development kit provided by Emotiv, which can recognize the five QoT indicator values by analyzing the 14 EEG signals. The QoT generation is realized through a B-spline neural network. The detailed process of generating this model has been described previously (14, 16, 17).

In our recent research (15), we showed that the emotional state space, which is spanned by the PAD states, can be used to identify QoT from a psychological perspective. The identification is achieved in two stages. In the first stage, the 14 EEG signals are used to estimate the PAD states. EEG-based fractal dimensions are extracted as complexity-based features, and a Gaussian process (GP) regression model is fitted to these features to estimate the PAD states, after learning the GP regression model’s covariance kernel using deep belief nets (DBN). In the second stage, the estimated PAD states are used as features to identify QoT by fitting another GP regression model with a spherical Gaussian kernel.
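A minimal sketch of the second identification stage only, assuming scikit-learn is available; the synthetic PAD data, the target construction, and the plain RBF kernel (standing in for the DBN-learned covariance of ref. 15) are all illustrative placeholders.

```python
# Sketch of QoT identification from PAD states via GP regression (illustrative
# only: synthetic data, standard RBF kernel instead of the DBN-learned
# covariance described in ref. 15). Requires scikit-learn.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(1)

# Hypothetical training set: rows are (pleasure, arousal, dominance) estimates,
# targets are QoT scores in [0, 1] graded during calibration trials.
pad_train = rng.uniform(-1.0, 1.0, size=(50, 3))
qot_train = (0.5 + 0.3 * pad_train[:, 0] - 0.2 * np.abs(pad_train[:, 1])
             + 0.05 * rng.normal(size=50))

gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0) + WhiteKernel(1e-2),
                              normalize_y=True)
gp.fit(pad_train, np.clip(qot_train, 0.0, 1.0))

# Online use: map the current PAD estimate to a QoT value (with uncertainty).
pad_now = np.array([[0.2, -0.6, 0.1]])
qot_mean, qot_std = gp.predict(pad_now, return_std=True)
print(float(qot_mean[0]), float(qot_std[0]))
```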

Brain-like robot adaptation according to QoT
The framework delineating how robots adapt to teleoperation according to QoT is shown in Figure 3. The perceptive planning and control method and a QoT-based adjustment mechanism are designed to construct the teleoperation system. The perceptive planning and control method is applied to guarantee the stability of the system under network delays (11). However, the efficiency and safety of telerobotic operations may still be affected by the quality of the teleoperator. In particular, the effects of low QoT become more evident when the teleoperator is performing tasks that require high dexterity.


In order to address this issue, we have proposed a QoT-based adjustment mechanism to imitate human adaptation. QoT is generated online and then sent to the robotic systems associated with the teleoperator’s commands. Based on QoT, the parameters of the robot planner and controller can be adjusted to assign appropriate velocity and responsivity for the robotic system. Intuitively, with a high QoT, the robotic system needs to perform quickly and responsively so that the teleoperator can efficiently control the robot and, in turn, achieve the teleoperation goal as quickly as possible. On the other hand, if QoT is low, the robotic system needs to perform relatively slowly with a decreased responsivity, so as to reduce or avoid the potential impact of insufficient and incorrect commands from the teleoperator.

In addition, making adjustments solely based on QoT does not always make for more efficient operations. For instance, when the teleoperator is completing simple tasks, such as driving a robot along a straight line at a low speed, a low QoT may be sufficient for the teleoperator to accomplish the task. To address this issue, in the QoT adaptive control design, we have introduced a concept we call the “task dexterity index” (TDI) (16, 17) to represent the dexterity and complexity of the task. A collection of task indicators is generated online based on the teleoperator’s control commands and the sensory information from the robot. By using a fuzzy inference scheme, these task indicators are mapped into a normalized value to represent the TDI.

In order to enhance teleoperation performance, robot performance can be adjusted according to both QoT and TDI, so that the teleoperator can safely and efficiently operate the robot to accomplish different tasks at different operation statuses. The adjustment to the robot comprises two parts: velocity adjustment and responsivity adjustment. For the robot velocity adjustment, we use the velocity gain ki,s, which maps the teleoperator’s commanded velocity to the robot velocity in the ith direction. The design adopts a linear mapping function described by vi,r = ki,s vi,tele, where vi,r and vi,tele are the robot velocity and the teleoperator’s commanded velocity in the ith direction, respectively. A larger velocity gain ki,s leads to a relatively larger robot velocity, and a smaller velocity gain leads to the inverse.

For robot responsivity adjustment, the controller gains of the robotic system are used. For a robotic system in which nonlinearities exist, by applying nonlinear and decoupling techniques such as nonlinear feedback control (18), the system model can be linearized and decoupled into several independent linear subsystems. A proportional-derivative (PD) controller can then be introduced to control the subsystems. For each controller, there are two gains, Ki = [ki1, ki2]. These gains are proportional to the robot responsivity, which makes them applicable for the responsivity adjustment.
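For illustration, one PD control step for a single decoupled subsystem is sketched below, showing how the gains ki1 and ki2 act as the responsivity knobs discussed above; the function name and gain values are hypothetical.

```python
# Minimal PD control step for one decoupled, linearized subsystem
# (illustrative; gains k1, k2 are the "responsivity" knobs discussed above).
def pd_step(error: float, error_rate: float, k1: float, k2: float) -> float:
    """Control input u = k1 * e + k2 * de/dt for one subsystem."""
    return k1 * error + k2 * error_rate


if __name__ == "__main__":
    # Larger gains -> stronger reaction to the same error, i.e., higher responsivity.
    print(pd_step(0.5, -0.1, k1=4.0, k2=1.0), pd_step(0.5, -0.1, k1=1.0, k2=0.2))
```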

Initially, a set of nominal gains K̃i is predefined for both the velocity gains and the controller gains. The nominal velocity gains are chosen as 1, and the nominal controller gains are chosen to achieve the best possible responsivity of the robotic system based on experimental tests. Then, based on the QoT and TDI, the desired gains are computed by a three-part formula [detailed in (16)], where γ, DH, and DL are threshold constants, and λ is a weight to adjust the influence of the QoT on the robot controller gains. These parameters are all defined based on the teleoperation requirements. The first part of the formula indicates that the robot should perform as rapidly as possible with high responsivity when the QoT is high enough or the task is simple enough.

FIGURE 3. Robot adaptation according to QoT in teleoperation. EEG, electroencephalogram; QoT, quality of teleoperator; e(s), system errors; y, system output; s, perceptive reference; +, positive; –, negative.

24 BRAIN-INSPIRED INTELLIGENT ROBOTICS: THE INTERSECTION OF ROBOTICS AND NEUROSCIENCE

The second part indicates that the robot should perform suitably fast or slow, with suitable responsivity, according to both the QoT and TDI, when the QoT is low and the task is intermediately complicated. The third part indicates that the robot should perform with an appropriately low velocity and low responsivity, adapting only to QoT, when the QoT is low and the task is very complicated.
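The exact piecewise control law is given in (16); the sketch below is only meant to mimic the qualitative three-part behavior just described, with made-up thresholds and a simple blend rather than the published formula.

```python
# Illustrative QoT/TDI gain scheduling (not the published formula from ref. 16):
# thresholds and the blending rule are placeholders that only reproduce the
# qualitative three-part behavior described in the text.
def desired_gain(qot: float, tdi: float, nominal: float,
                 q_high: float = 0.8, q_low: float = 0.4, t_low: float = 0.3) -> float:
    """Scale a nominal velocity/controller gain from teleoperator quality (qot)
    and task dexterity (tdi), both assumed normalized to [0, 1]."""
    if qot >= q_high or tdi <= t_low:
        # Part 1: operator is sharp or the task is simple -> full speed/responsivity.
        return nominal
    if qot >= q_low:
        # Part 2: intermediate case -> blend according to both QoT and TDI.
        blend = 0.5 * (qot + (1.0 - tdi))
        return nominal * max(0.2, min(1.0, blend))
    # Part 3: low QoT on a demanding task -> slow down regardless of TDI.
    return nominal * max(0.1, qot)


if __name__ == "__main__":
    for qot, tdi in [(0.9, 0.7), (0.6, 0.5), (0.2, 0.9)]:
        print(qot, tdi, round(desired_gain(qot, tdi, nominal=1.0), 3))
```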

Once the desired controller gains are computed based on QoT and TDI, to ensure system performance they are examined to determine whether they lie in the efficient region [defined in (16)]. If they lie in the region, the desired gains are set as the controller gains. Otherwise, the gains that lie in the region and are nearest to the desired gains in the Euclidean space of the ki1–ki2 responsivity plane are selected by solving a quadratic programming problem (16).

The designed approaches were implemented on an Internet-based mobile manipulator teleoperation system for three types of tasks: manipulation, navigation, and mobile manipulation (16). Our results demonstrated that when robot self-adaptation was implemented according to QoT, the time cost, number of collisions, and number of attempts for accomplishing the three types of tested tasks were all reduced, by 45%, 80%, and 40%, respectively. In addition, with this adaptation design, teleoperators found that the robot’s operations became easier and smoother due to active adaptation of the robot according to the human’s operation status.

Conclusions and future directions
In summary, brain-like robot control and adaptation approaches have been developed to advance intelligent robotics. In the brain-like, non-vector based control approach, control of the robot is achieved using sets instead of vectors. By applying this approach to visual servo control of robotic manipulators, it is possible to eliminate the requirement for image-feature extraction and tracking. In the brain-like adaptation approach, the robot is able to discern the operational status of humans and can then actively adjust its behaviors based on that status to enhance the safety and efficiency of telerobotic operations. Our experiments in applying these two approaches to real robotic systems illustrated their effectiveness and advantages, making robotic systems more intelligent.

For the successful application of robotics to intelligent manufacturing, several important issues will need to be addressed. First, robotic systems will need to be more flexible and reconfigurable. Existing fixed-base industrial robots will no longer satisfy the demands of modern markets. In the future, robotic systems will have more scalable work spaces, more flexibility for accomplishing various tasks, and more adaptability to various environments. Second, more advanced sensing systems will be required for intelligent robotics. Stable and reliable sensing systems such as vision, force, haptic, and tactile sensors will need to be integrated into robotic systems to allow them to perform various tasks in unstructured environments. Third, safety and efficiency concerns regarding human–robot interaction will need to be dealt with. In the future, robots will be mostly uncaged and thus will frequently work alongside humans. Therefore, safety considerations will be critical for protecting humans from robots, and vice versa. Developing efficient approaches such as using natural language, gestures, physical contact, and even connectivity to the human mind will be essential to give humans the capability to communicate with robots naturally and efficiently, and to enable robots to actively learn from humans.

References
1. S. Hutchinson, G. D. Hager, P. I. Corke, IEEE Trans. Rob. Autom. 12, 651–670 (1996).
2. É. Marchand, F. Chaumette, Rob. Auton. Syst. 52, 53–70 (2005).
3. L. Doyen, J. Math. Imaging Vis. 5, 99–109 (1995).
4. J. Zhao, B. Song, N. Xi, K. Wai, C. Lai, 2011 50th IEEE Conference on Decision and Control and European Control Conference (CDC-ECC) (IEEE, Orlando, FL, December 2011), pp. 5683–5688.
5. J. Zhao et al., Automatica 50, 1835–1842 (2014).
6. J. Zhao, N. Xi, L. Sun, B. Song, 2012 IEEE 51st Annual Conference on Decision and Control (CDC) (IEEE, Maui, HI, December 2012), pp. 5685–5690.
7. B. Song et al., IEEE Trans. Robot. 30, 103–114 (2014).
8. J. Zhao, H. Shen, N. Xi, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE, Hamburg, Germany, September–October 2015), pp. 2154–2159.
9. J. Zhao et al., 2012 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (IEEE, Kaohsiung, Taiwan, July 2012), pp. 87–92.
10. R. J. Anderson, M. W. Spong, IEEE Trans. Automat. Contr. 34, 494–501 (1989).
11. N. Xi, T. J. Tarn, Rob. Auton. Syst. 32, 173–178 (2000).
12. I. Elhajj et al., Proc. IEEE 91, 396–421 (2003).
13. T. Itoh, K. Kosuge, T. Fukuda, IEEE Trans. Rob. Autom. 16, 505–516 (2000).
14. Y. Jia, N. Xi, Y. Wang, X. Li, 2012 IEEE International Conference on Robotics and Automation (ICRA) (IEEE, St. Paul, MN, May 2012), pp. 451–456.
15. M. Alfatlawi, Y. Jia, N. Xi, 2015 IEEE International Conference on Robotics and Biomimetics (ROBIO) (IEEE, Zhuhai, China, December 2015), pp. 470–475.
16. Y. Jia et al., Int. J. Robot. Res. 33, 1765–1781 (2014).
17. Y. Jia, N. Xi, F. Wang, Y. Wang, X. Li, 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE, San Francisco, CA, September 2011), pp. 184–189.
18. T. Tarn, A. K. Bejczy, A. Isidori, Y. Chen, The 23rd IEEE Conference on Decision and Control (IEEE, Las Vegas, NV, December 1984), pp. 736–751.

Acknowledgments
This research work was partially supported under U.S. Army Research Office Contract No. W911NF-11-D-0001 and grants from the U.S. Army Research Office (W911NF-09-1-0321, W911NF-10-1-0358, and W911NF-16-1-0572).


Neurorobotics: A strategic pillar of the Human Brain Project

Alois Knoll1* and Marc-Oliver Gewaltig2

1Technical University of Munich, Department of Informatics, Chair of Robotics and Embedded Systems, Garching bei München, Germany
2Ecole Polytechnique Fédérale de Lausanne, Blue Brain Project, Campus Biotech, Geneva, Switzerland
*Corresponding Author: [email protected]

Neurorobotics is an emerging science that studies the interaction of brain, body, and environment in closed perception–action loops where a robot’s actions affect its future sensory input. At the core of this field are robots controlled by simulated nervous systems that model the structure and function of biological brains at varying levels of detail (1). In a typical neurorobotics experiment, a robot or agent will perceive its current environment through a set of sensors that will transmit their signals to a simulated brain. The brain model may then produce signals that will cause the robot to move, thereby changing the agent’s perception of the environment.

Observing how the robot then interacts with its environment and how the robot’s actions influence its future sensory input allows scientists to study how brain and body have to work together to produce the appropriate response to a given stimulus. Thus, neurorobotics links robotics and neuroscience, enabling a seamless exchange of knowledge between these two disciplines.
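As a toy illustration of such a closed perception–action loop (unrelated to the NRP software described below), a single leaky "neuron" can turn a distance reading into a motor command whose execution changes the next reading.

```python
# Toy closed perception-action loop (illustrative only; not the NRP API).
# A single leaky "neuron" converts a distance sensor reading into a forward
# velocity command; moving changes the next reading, closing the loop.
def simulate(steps: int = 20, target: float = 1.0, dt: float = 0.1) -> float:
    position, membrane = 0.0, 0.0
    for _ in range(steps):
        distance = target - position             # perception of the environment
        membrane += dt * (-membrane + distance)  # leaky integration ("brain")
        velocity = max(0.0, membrane)            # motor command from brain state
        position += dt * velocity                # action changes the environment...
        # ...and therefore the robot's future sensory input.
    return position


if __name__ == "__main__":
    print(round(simulate(), 3))
```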

Here, we provide an introduction to neurorobotics and report on the current state of development of the European Union–funded Human Brain Project’s (HBP’s) Neurorobotics Platform (2, 3). HBP is Europe’s biggest project in information communication technologies (ICT) to date (www.humanbrainproject.eu) and is one of two large-scale, long-term flagship research initiatives selected by the European Commission to promote disruptive scientific advance in future key technologies. It will have a duration of 10 years and deliver six open ICT platforms for future research in neuroscience, medicine, and computing, aimed at unifying the understanding of the human brain and translating this knowledge into commercial products.

History and current status of neurorobotics
One of the first researchers to apply the concepts of neurorobotics was Thomas Ross, who in 1933 devised a robot with a small electromechanical memory cell that could learn its way through a very simple maze via “conditioned reflex” (4). In his 1948 book Cybernetics, Norbert Wiener laid the theoretical foundation for robotics as “the study of control and communication in the animal and the machine” (5). Since then, however, robotics research has focused largely on industrial applications.

Recently, industry has again been advancing technology by building robotic bodies that are softer, more flexible, and more durable than before. However, despite numerous attempts, no robot today matches the cognitive or behavioral abilities of even the simplest animals, let alone humans. What is still missing after decades of effort? Robots are still lacking a brain that allows them to learn to adapt to and exploit the physics of their bodies.

Today, there are two main schools of thought in neurorobotics research. The first focuses on biologically inspired robots with bodies, sensors, and actuators that mimic structural and functional principles at work in the bodies and organs of living creatures. The second concentrates on brain-inspired control architectures.

Biologically inspired robots are adaptable and can display rich perceptual and behavioral capabilities. In contrast to industrial robots, they often use compliant materials that make their mechanics intrinsically flexible. Examples of advanced biology-inspired humanoid robots are the iCub, a humanoid robot “child” (6); Kojiro, a humanoid robot with about 100 artificial muscles (7); and ECCEROBOT (Embodied Cognition in a Compliantly Engineered Robot), a humanoid upper torso that attempts to replicate the inner structures and mechanisms of the human body in a detailed manner (8). More recently, engineers have created Roboy, a soft human body with a spine and a head that can display several emotional patterns (9); the ASIMO robot (10); and, most recently, Atlas, the latest agile anthropomorphic robot developed by Boston Dynamics (11). The last two are constructed from hard materials and implement research in machine walking. Such advances in multilegged robots have led to a renewed interest in applications for the military [BigDog, also from Boston Dynamics (12)]; disaster management [iRobot’s PackBot (13), a snake-like robot developed by Hitachi-GE Nuclear Energy and IRID (International Research Institute for Nuclear Decommissioning) (14), and a scorpion-like robot from Toshiba (15)]; and entertainment [the NAO and Pepper robots from Aldebaran Robotics (16)]. Researchers have also developed robots that can mimic the characteristics of animals, such as locomotion in snakes and spiders, whisking in rodents, or jumping in frogs.

Brain-inspired control architectures are robotic control systems that on some level reflect properties of animal nervous systems. In general, they are tailor-made for a specific set of tasks, often using a combination of artificial neural networks, computer vision/audition, and machine learning algorithms. Early work in this field was done by Miyamoto et al. (17), who developed an artificial neural network with a brain-inspired hierarchical architecture that could control an industrial manipulator by acquiring a dynamical model of the robot through the “feedback-error-learning” method. Later, Edelman et al. developed the “Darwin” series of recognition automata (18) to study different behavioral phenomena ranging from pattern recognition to association to autonomous behavior and sensorimotor coordination. Cox and Krichmar derived a robot controller based on the neuromodulatory systems of mammalian brains (19). Burgess et al. used a mobile robot to test a model of the rat hippocampus (20).

1Technical University of Munich, Department of Informatics, Chair of Robotics and Embedded Systems, Garching bei München, Germany
2Ecole Polytechnique Fédérale de Lausanne, Blue Brain Project, Campus Biotech, Geneva, Switzerland
*Corresponding Author: [email protected]


Burgess et al. used a mobile robot to test a model of the rat hippocampus (20). Ijspeert’s group investigated low-level mechanisms of locomotion in vertebrates by constructing a salamander robot controlled by central pattern generators (21). Further work has included self-organizing synchronization in robotic swarms (22) and the learning of compositional structures from sensorimotor data (23).

Lund et al. used spiking-neuron models to derive a model for cricket phonotaxis that they evaluated on a mobile robot (24). Since then, spiking neural networks have become a standard tool in brain-based robotics and are used to control complex biomechanical systems such as the musculoskeletal model of the human body developed by Nakamura’s group (25). For example, Sreenivasa et al. published a study on modeling the human arm stretch reflex based on a spinal circuit built of spiking neurons (26). Kuniyoshi et al. developed a fetus model to simulate the self-organized emergence of body maps in a nervous system of spiking neurons (27, 28). Finally, Richter et al. controlled a biomimetic robotic arm with a model of the cerebellum that was executed in biological real-time on the SpiNNaker neuromorphic computing platform (29, 30).

A third line of research that is usually not mentioned along with neurorobotics, but with which it greatly overlaps, is neuroprosthetics, which can in fact be viewed as the translational branch of neurorobotics. To advance neurorobotics further, a close collaboration between neuroscientists, roboticists, and experts in appropriate computing hardware is clearly mandatory.

The Human Brain Project’s Neurorobotics Platform

The Neurorobotics Platform (NRP) is a novel tool that allows scientists to collaboratively design and conduct in silico experiments in cognitive neuroscience, using a neurorobotics-focused approach. The NRP thus plays an integrative role in HBP as it provides the tools for all scientists to study brain models in the context of their natural habitat: the body.

The overarching goal of HBP is to help unify our understanding of the human brain and to use this knowledge to develop new products and technologies. HBP features a huge and diverse spectrum of research and development activities ranging from human and mouse brain-data collection to the discussion of ethical issues arising from brain simulation and brain-based technologies. These activities are organized in a network of closely collaborating subprojects. As shown in Figure 1, the neurorobotics subproject collaborates directly with almost all other subprojects. Six of these subprojects are tasked with developing ICT platforms that will make technology developed within HBP available to the public. The Neuroinformatics Platform will deliver searchable brain atlases to enable access to a vast amount of digitized brain data. The Medical Informatics Platform will gather and process clinical data for research on brain diseases. The Brain Simulation Platform will provide detailed large-scale brain simulations that will run on the supercomputer infrastructure of the High-Performance Analytics and Computing Platform and on the novel neuromorphic hardware of the Neuromorphic Computing Platform. The NRP will connect the brain simulation and computing platforms through an advanced closed-loop robot simulator for neurorobotics experiments whereby robots can be controlled by a simulated brain model.

All these platforms contribute to the virtualization of brain and robotics research. Virtual brain research, when supported by powerful and well-calibrated models, will enable researchers to perform experiments impossible in the real world, such as the simultaneous placement of 1 million probes in a mouse brain. By the same token, the advantage of virtual robotics research is the enormous acceleration of experiment and development speed, alleviating dependency on real hardware in real environments. The possibility for smooth “mode changes” between real-world and virtual design phases is certainly essential, and only when this step becomes simple will an iterative give-and-take of insights between both worlds be possible.

In the remainder of the article we will introduce the “neurorobotics workflow,” a principled step-by-step approach to designing neurorobotics experiments for both roboticists and neuroscientists.

FIGURE 1. A schematic of the Human Brain Project showing collaborations between the neurorobotics subproject and other subprojects shown in the list on the left.


The following sections cover the implementation of this workflow in the NRP and the underlying tools and software architecture. We will then present a series of pilot experiments that were designed using the NRP to validate the workflow and to benchmark the performance and usability of its first public software release to the scientific community. Finally, we will summarize results achieved so far in the neurorobotics subproject and provide an outlook on the future development of the NRP.

Methodology

The approach for the NRP consists of providing a number of design applications for models (of environments, robot bodies, and brains) and simulation engines that are integrated into a web-based frontend. Using this web frontend, users at different locations can rapidly construct a robot model, its brain-based controller, an environment, and an execution plan. We call this ensemble a “neurorobotics experiment.” The NRP also allows reusing and sharing of previously defined experiments, thus opening a new area of collaborative neurorobotics research.

Only through integration of the appropriate simulation engines can we expect to achieve meaningful results, especially if we want to close the loop between the brain, the robot, and its environment such that the simulated robot perceives its simulated environment through its simulated sensors and interacts with it through its simulated body. The simulation technologies used must provide the highest available fidelity and the best possible match between an observation in the real world and its counterpart in the virtual world.

In brain modeling, there are currently two extremes. The first comprises models that focus on the functional properties of nervous systems. They define control architectures and neural network models, possibly trained by deep learning algorithms, with the aim of solving a particular set of tasks. Examples are the Spaun model (31) and control architectures commonly found in cognitive robotics. At the other extreme are digital reconstructions of neural circuits (32) or even entire brains of mice and rats, for example, based on experimental data. These models focus foremost on the structural and dynamical details of the reconstructed system and regard brain function as an emergent phenomenon of these brain reconstructions.

While many researchers argue in favor of one or the other position, we propose that the most productive route is to combine the two approaches. For example, many theories exist for higher-level brain functions like visual perception, but not all of these theories can be true at the same time. Some may be appropriate for humans, whereas others may be applicable to cats or rodents. The only way to separate suitable theories from less suitable ones is to give researchers a tool that allows them to confront a given theory of brain function with the anatomical and physiological realities of a particular brain embedded in a concrete body, be it mouse, cat, or human. The NRP aims to be such a tool, following the time-tested approach of analysis by synthesis.

A typical neurorobotics experiment might involve the simulation of a rat as it navigates through a maze. In this case, the control architecture for the simulated rat could comprise sensory areas, a hippocampus, and a motor area to generate movements. Traditionally, these components are individually selected and specifically adapted to match the robot, the environment, and the task. The neurorobotics workflow departs from this approach: Rather than designing specific neural control architectures for each robot and each experiment, it provides elements for the rapid design and training of application-specific brains and for embedding these brains in appropriate robotic embodiments, with the theme of “modular brains for modular bodies.” It will incorporate both the experimenter’s own research as well as results from others.
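
To make the “modular brains for modular bodies” theme concrete, the minimal Python sketch below composes exchangeable sensory, hippocampal, and motor modules into one controller for a simulated rat. All class names, interfaces, and numbers are our own illustrative assumptions; this is not NRP code.

```python
# Illustrative composition of exchangeable "brain" modules (hypothetical names).

class SensoryModule:
    def process(self, camera_image):
        # Reduce a raw observation to a tiny feature vector:
        # mean brightness of the left and right image halves.
        mid = len(camera_image) // 2
        left = sum(camera_image[:mid]) / max(mid, 1)
        right = sum(camera_image[mid:]) / max(len(camera_image) - mid, 1)
        return {"left": left, "right": right}

class HippocampusModule:
    def __init__(self):
        self.visited = set()
    def update(self, position):
        # Toy stand-in for place coding: remember discretized positions.
        self.visited.add((round(position[0], 1), round(position[1], 1)))

class MotorModule:
    def command(self, features):
        # Steer toward the brighter side (simple taxis behavior).
        return {"forward": 1.0, "turn": features["right"] - features["left"]}

class ModularController:
    """Composes the three modules into one closed-loop controller."""
    def __init__(self):
        self.sensory = SensoryModule()
        self.memory = HippocampusModule()
        self.motor = MotorModule()
    def step(self, camera_image, position):
        features = self.sensory.process(camera_image)
        self.memory.update(position)
        return self.motor.command(features)

controller = ModularController()
print(controller.step(camera_image=[0.1, 0.2, 0.8, 0.9], position=(0.0, 1.0)))
```

The point of the sketch is only that each module can be swapped independently of the others, which is the property the workflow exploits.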

The neurorobotics workflow starts with the choice of the system to be examined and ends with the execution of the experiment. It has four elementary steps, as described below and illustrated in Figure 2.

Step 1: Choose in silico experiment to run

First, the researcher needs to specify the system to be investigated by deciding on the robot body, the environment, the task for the robot to solve, and the brain (the neural controller) that will control the body. We will refer to the specific combination of body, environment, and neural system as a “neurorobotic system.”

FIGURE 2. The neurorobotics workflow.


The concrete decisions for building each neurorobotic system will influence all other steps of the workflow.

Step 2: Choose desired level and platform

In the second step, the implementation of the neurorobotic system chosen in step 1 needs to be specified. The simulated nervous systems can be realized at varying levels of detail ranging from conceptual abstract models to spiking point-neuron models (33) or highly detailed neural simulations (32). Similarly, the robotic body can be built of stiff material with standard actuators from industrial robotics or can be based on a more complex design of soft structures (34), including biomimetic musculoskeletal actuators with many degrees of freedom.

Every component may be realized physically or in simulation. But why use simulated robots when our ultimate goal is a brain model that can control a real robot in a physical environment? Real robots have the advantage that all real-world complexities are considered. The same applies to the environments in which the robot is meant to perform its task. Computer models of robots and environments are complex and still inaccurate. However, robot simulations can be easily accessed without the need for purchasing expensive hardware. They also enable quick changes of the robot design and do not suffer from mechanical failure. During the simulation, the system state of the robot can be fully inspected at any point in time. Specific setups like the parallel execution of hundreds of simulations or the simulation of biological muscles cannot be realized at all in a physical setup. In the end, the decision whether to use simulated or physical robots is not influenced by scientific considerations, but rather by our ability to efficiently implement the brain model into the robot.

Like the robot, brain models can be executed either as a pure simulation or as a physical model, with the difference being that simulations tend to be more accurate than any physical model we can build to date. Thus, if we are interested in brain models that capture many of the properties of real brains, we must resort to simulation. These simulations, however, run much more slowly than real-time, and this limitation forces us to also simulate the combination of robot and environment, because we cannot slow down the physics that govern them. To test and implement simplified and abstracted brain models, we must find physical implementations that will also allow us to use real robots and environments.

In HBP, physical brain models are developed through the neuromorphic computing subproject, either by emulating neural dynamics based on analog integrated circuits (35) or by SpiNNaker, a digital neuromorphic system built of standard smartphone microprocessors interconnected by an efficient communication mesh (30).

Step 3: Set instrumentation and alignment

After step 2 of the workflow, both the brain simulation and the robot are fully specified. Now, the researcher must define the connection between brain and body, specifically the mappings that translate the neural output of the brain into motor commands for the robot, and translate the sensor readings into input for the brain. An example of this mapping process is depicted in Figure 3, where touch receptors located on the skin and the whiskers of a mouse body are mapped to the corresponding cortical areas of a simulated mouse brain. Defining these mappings within the closed-loop integration of a brain simulation and a body model comprises the process stage that is the core of the workflow.
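
As a rough illustration of what such a mapping specification might look like in code, consider the sketch below. The dictionary format, population names, and scaling constants are invented for this example and are not the NRP’s actual configuration format.

```python
# Invented mapping format: body receptors wired to neuron populations,
# and neuron populations wired to actuators.

sensor_to_brain = {
    "whisker_left": {"population": "barrel_cortex_left", "neurons": range(0, 50)},
    "whisker_right": {"population": "barrel_cortex_right", "neurons": range(50, 100)},
}

brain_to_motor = {
    "motor_cortex_forward": {"actuator": "hind_legs", "gain": 0.8},
    "motor_cortex_turn": {"actuator": "neck_joint", "gain": 0.3},
}

def encode(sensor_readings):
    """Translate touch intensities into input currents for the mapped populations."""
    currents = {}
    for sensor, mapping in sensor_to_brain.items():
        reading = sensor_readings.get(sensor, 0.0)
        currents[mapping["population"]] = 10.0 * reading  # arbitrary current scale
    return currents

def decode(firing_rates):
    """Translate population firing rates (Hz) into actuator commands."""
    commands = {}
    for population, mapping in brain_to_motor.items():
        commands[mapping["actuator"]] = mapping["gain"] * firing_rates.get(population, 0.0)
    return commands

print(encode({"whisker_left": 0.6}))
print(decode({"motor_cortex_forward": 12.0}))
```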

Step 4: Run the experiment

In the final step of the workflow, the researcher executes the neurorobotics experiment. In this phase, the brain model and the robot model run in parallel with the output of one system being the input of the other and vice versa. It is of vital importance that both simulations are synchronized and run on the same timescale, since only then can the behavior observed in the neurorobotics experiment be validated with data from neuroscience and biology. During the execution of the experiment, the researcher can monitor and control all states and parameters of the experiment (brain, body, and environment) including the option to pause and restart it, which is technically impossible to achieve in the laboratory.
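
The core of this execution phase is a lock-step exchange: both simulators advance by the same slice of simulated time before trading data. The sketch below, with stand-in brain and world objects, is only meant to convey that scheduling idea; it is not the platform’s closed-loop engine.

```python
import time

DT_MS = 20.0  # both simulators advance by this slice of simulated time

def run_experiment(brain, world, duration_ms, paused=lambda: False):
    t, motor_cmd = 0.0, {}
    while t < duration_ms:
        if paused():                 # the researcher can pause and resume at will
            time.sleep(0.1)
            continue
        sensors = world.step(DT_MS, motor_cmd)   # advance physics by DT_MS
        motor_cmd = brain.step(DT_MS, sensors)   # advance neurons by the same DT_MS
        t += DT_MS
    return t

class DummyWorld:
    def step(self, dt_ms, motor_cmd):
        return {"light_left": 0.2, "light_right": 0.9}

class DummyBrain:
    def step(self, dt_ms, sensors):
        return {"turn": sensors["light_right"] - sensors["light_left"]}

print(run_experiment(DummyBrain(), DummyWorld(), duration_ms=100.0))
```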

In essence, the NRP consists of different model designers and simulation engines that allow rapid construction of neurorobotics experiments. It enables researchers to design virtual robot bodies, to connect these bodies to brain models, to embed them in rich, virtual environments, and, ultimately, to calibrate the brain models to match the specific characteristics of the robot’s sensors and “muscles.”

FIGURE 3. Connecting a virtual mouse model to the primary somatosensory cortex of a mouse brain simulation. The left and right sides of the figure depict the visualizations of the body and brain simulations, respectively. The mapping between brain regions and touch receptors of the mouse body is displayed in the center panels.


Collections of predefined models for brains, bodies, environments, sensors, and actuators will also allow nonexpert users to quickly set up new experiments. Researchers will further be able to use traditional techniques from neuroscience, such as lesion studies or manipulations of single neurons, to identify the right control architecture for the specific task. The resulting setups allow researchers to perform in silico experiments, initially replicating previous work, but ultimately breaking new ground. The individual steps from the collection of brain data to synthesizing a brain model based on these data and finally to connecting the brain simulation to a body model are illustrated in Figures 4–6.

It goes without saying that for virtualized neurorobotics research to be meaningful, the models for all constituents (environment, robot bodies, and brains) must reflect and integrate current knowledge of all entities, including knowledge about all abstraction levels of the brain, from the cell level to information processing. The neurorobotics subproject will therefore openly collaborate with research groups worldwide, both within and outside HBP, and continuously integrate the state of the art in brain and robot research.

Even though the NRP is designed with a focus on simulation, it is important to realize that the virtual design and virtual execution tool chain implemented by the NRP can, at any step along the process, be translated to the real world; thus, real robots can be designed from their models in the virtual world. They will produce the same behavior as their virtual counterparts, provided the models are permanently calibrated. For industrial practice, this means there can be a very rapid design-test-revise cycle that produces robots with integrated behavior based on “brain-derived technologies.”

Implementation

The NRP consists of a frontend, called the “neurorobotics cockpit,” and a simulation backend. The cockpit gives access to a set of tools that implement the first three steps of the neurorobotics workflow. The tools allow users to select and configure robots and environments, configure the brain model, and set up neurorobotics experiments. Different visualization channels allow control and visualization of the simulation and interaction with the experiment while it is running. The backend orchestrates the different tools for the simulation of robot, environment, and brain model, and manages the signal flow between these tools and the frontend. An overview of the complete system architecture is depicted in Figure 7.

Frontend: Designers

The robot designer creates and adapts robot models, the environment designer sets up the virtual environment with which the robot will interact, the brain interfaces and body integrator (BIBI) specifies the mapping between the simulated brain and the sensors and actuators of the robot, and the experiment designer defines the neurorobotics experiment.

FIGURE 4. Schematic of brain modeling based on multiscale integration of data from different sources.

FIGURE 5. Synthesizing a brain model from measurement data.


The robot designer implements all functionality necessary to design robotic bodies, adapt their appearance, set their dynamic parameters, and equip them with sensors and actuators. Since users in neuroscience will be less interested in designing robotic bodies, needing them simply as tools for conducting closed-loop experiments, the designer includes libraries with ready-to-use models of robots, sensors, and actuators. In the current version of the NRP, users can choose from a model of the iCub robot (6), the six-legged walking machine LAURON (37), or the mobile robot Husky (38). Most importantly, the designer also includes a virtual mouse model that allows one-to-one comparison to experimental neuroscience studies. Future releases of the NRP will incorporate further refinements and improvements to this model, such as the addition of a realistic simulation of the musculoskeletal system. For the advanced user, the NRP offers a robot designer plug-in for the popular open-source 3D modeling software Blender (36).

Similar to a real neuroscience experiment, a neurorobotics experiment is conducted in a specific environment. The environment designer is a tool for designing virtual environments tailored to the needs of the experiment. Moreover, users will be given the option to share their models with other scientists using the common HBP infrastructure.

The mapping between brain and body lies at the core of every neurorobotics experiment. In correspondence with step 3 of the neurorobotics workflow, BIBI provides tools for both spatial and representation mapping. Presently, there is only support for small brain models, which means that no spatial mapping is required. The representation mapping between the brain model and robotic sensing and actuation is accomplished by implementing so-called “transfer functions,” which translate the data exchanged between the brain simulation and the robot simulation. Beyond mere signal translation, transfer functions can also directly control the robot’s actuators in response to sensory input and thereby bypass further processing in the brain in a reflex-like manner.
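
In the NRP such translations are written against a Python-based domain-specific language (46); the plain-Python sketch below only illustrates the two directions of translation plus a reflex-like bypass. The function names and constants are invented for this illustration.

```python
def spikes_to_wheel_speed(spike_count, window_ms=20.0, gain=0.01):
    """Brain-to-robot direction: turn a population spike count into a wheel speed."""
    rate_hz = spike_count / (window_ms / 1000.0)
    return gain * rate_hz

def camera_to_input_current(red_pixel_fraction, max_current=500.0):
    """Robot-to-brain direction: turn a camera statistic into an input current."""
    return max_current * red_pixel_fraction

def bumper_reflex(bumper_pressed, commanded_speed):
    """Reflex-like bypass: stop the wheels directly, without involving the brain."""
    return 0.0 if bumper_pressed else commanded_speed

speed = spikes_to_wheel_speed(spike_count=40)
print(camera_to_input_current(red_pixel_fraction=0.3), bumper_reflex(False, speed))
```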

We are now adding support for more realistic brain models in close collaboration with HBP’s brain simulation subproject, which is developing the brain builder and the brain atlas embedding module as part of its Brain Simulation Platform. Following the paradigm of predictive neuroscience, this platform builds upon and extends the bottom-up biological reconstruction process to brain modeling, which has recently enabled the digital reconstruction of the microcircuitry of the rat somatosensory cortex (32).

The brain builder is at the core of this process. It allows researchers to synthesize brain models from a large-scale database delivered by the neuroinformatics subproject, which is designed to store the vast amount of available experimental knowledge about the brain. General principles of brain structure and function that are beyond the scope of experimental data are represented algorithmically and validated against biological knowledge. Depending on the intended application and the research question, the brain builder will support the generation of brain models at different levels of granularity ranging from the molecular level to microcircuits to whole brains. Both brain models and experimental data can be visualized and searched using the brain atlas embedding module.

After selecting a brain model from this component, BIBI will allow researchers to set up a spatial mapping by selecting neurons graphically using the brain atlas embedding module, and to map these neurons to the robot model. The neurorobotics subproject will develop an open standard for storing and sharing the brain–body mappings created with the BIBI tool.

The experiment designer combines the output of the other three designers in a virtual protocol for neurorobotics experiments. Future releases of the NRP will contain an intuitive graphical user interface for setting the number of runs, terminating conditions, and measurement values to be recorded during experiments.

Frontend: Visualization

The second group of frontend software components addresses the visualization of the neurorobotics experiment. Depending on the experimental setup, this visualization can be computed from live data in a currently running simulation or generated from a previously recorded experiment.

The NRP supports two different types of visualization. The first is the web-based experiment simulation viewer, which runs in a browser and can therefore be used on any standard personal computer or even on mobile devices, making the NRP easily accessible to interested users without the burden of buying dedicated hardware. However, compared to real-world experiments, the interaction is less intuitive and the fidelity of the visualization is limited by the screen size and performance of the graphics processing unit.

FIGURE 6. Defining a mapping between the brain model and the robot model.



The second type of visualization supported by the NRP is the high-fidelity experiment simulation viewer, which delivers a much more immersive experience by rendering life-size visualization on a display wall or in a cave automatic virtual environment (CAVE). While this approach requires complex installation of displays and dedicated hardware, it delivers the highest degree of realism and even opens up the possibility for mixed-reality setups, where persons or objects in front of the display wall or CAVE can become part of the experiment.

Backend: Simulation

The backend of the NRP processes the specifications of the experiment and coordinates the simulators. The world simulation engine computes and updates the states of the robot and the environment based on the system descriptions from the corresponding designers. It is based on the highly modular Gazebo simulator (39), which can be easily augmented with new simulation modules. The brain model is simulated by the well-established neural simulation tool (NEST) (40), which ensures that even brain models of the largest scale can be simulated using HBP’s extensive computer resources. In the future, the user will not only be able to choose the desired brain model but also whether the simulation will be computed on standard hardware using a point-neuron simulator, on one of the two neuromorphic hardware simulators that are provided by the neuromorphic computing platform, or on a supercomputer running a detailed simulator. Both simulations are coordinated and synchronized by the closed-loop engine that connects and manages the brain simulation and robot simulation according to the mapping defined in the BIBI component. In correspondence with the different levels of modeling supported by the brain builder, the brain simulation subproject offers different simulators for molecular-, cellular-, and network-level analyses. Whereas network-level simulations capture less detail about individual neurons and synapses, they are based on established point-neuron models that support large-scale models of whole brain regions or even complete brains.
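
For readers unfamiliar with NEST, the snippet below shows the flavor of its Python interface for a tiny point-neuron network. It is written against NEST 2.x-style model names (for example, "spike_detector"; newer releases rename some models) and is a generic NEST example, not code taken from the NRP backend.

```python
import nest  # requires a local NEST installation with its Python bindings

nest.ResetKernel()
neurons = nest.Create("iaf_psc_alpha", 20)                       # leaky integrate-and-fire cells
noise = nest.Create("poisson_generator", params={"rate": 8000.0})
recorder = nest.Create("spike_detector")

nest.Connect(noise, neurons, syn_spec={"weight": 10.0})          # drive every cell with noise
nest.Connect(neurons, recorder)                                  # record all spikes

nest.Simulate(200.0)                                             # simulated time in milliseconds
events = nest.GetStatus(recorder, "events")[0]
print(len(events["times"]), "spikes in 200 ms")
```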

The NRP is fully integrated into the HBP Collaboratory, a web portal providing unified access to the six ICT platforms developed in HBP. The portal implements a common set of services, such as storage and authentication, that is automatically available to users of the NRP and allows an easy exchange of data. Integrating the newly developed workflows and data formats into a common portal makes new research results immediately available to all members and collaborators. The NRP therefore automatically benefits from the work of the entire neuroscientific community, making it not only the most advanced tool in the field, but one that sets a new standard for state-of-the-art neurorobotics research.

Experimental results

We validated both the neurorobotics workflow and the NRP in four different experiments that are available in the first public release of the platform. Because the interface to the brain simulation platform is under intense development, all experiments currently rely on simplified neural controllers simulated by NEST (40).

Basic closed-loop integration

The first proof-of-concept experiment carried out on the NRP was a simple Braitenberg vehicle (41), in a version designed for the Husky wheeled robot and the LAURON hexapod. The setup is based on a model of the Husky robot that is equipped with a camera and capable of moving around in a virtual room included in the NRP. The camera output is processed and forwarded to the NEST simulation of a neural network implementing basic phototaxis. As shown in Figure 8, the room contains two displays located on opposite walls. During the simulation, the user can interactively set the color of each display.

To demonstrate the modularity of our approach, the Braitenberg experiment was also implemented for the walking machine LAURON (Figure 8). Both the Husky and LAURON robots were directed by the Braitenberg brain to walk in the direction of the red stimulus.
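
The control principle of this pilot experiment is easy to state in a few lines. The toy sketch below shows the crossed excitatory wiring of a Braitenberg-style phototaxis controller; the image format, threshold, and gains are our own assumptions, and in the actual experiment the equivalent computation is carried out by the NEST-simulated network.

```python
# Toy Braitenberg-style phototaxis: each camera half excites the wheel on the
# opposite side, so the robot turns toward the red stimulus.

def red_fraction(pixels):
    """Fraction of (r, g, b) pixels that are predominantly red."""
    red = sum(1 for r, g, b in pixels if r > 1.5 * max(g, b))
    return red / max(len(pixels), 1)

def braitenberg_step(left_pixels, right_pixels, base_speed=0.2, gain=1.0):
    left_stim = red_fraction(left_pixels)
    right_stim = red_fraction(right_pixels)
    # Crossed wiring: stimulus on the right speeds up the left wheel.
    left_wheel = base_speed + gain * right_stim
    right_wheel = base_speed + gain * left_stim
    return left_wheel, right_wheel

# A red screen seen in the right half of the image makes the robot turn right.
left_half = [(10, 10, 10)] * 100
right_half = [(200, 20, 20)] * 100
print(braitenberg_step(left_half, right_half))
```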

FIGURE 7. Components and architecture of the Human Brain Project’s Neurorobotics Platform (NRP).



Simulation of a humanoid robot and a retina model

Studies of human brain models will require realistic humanoid embodiments. To this end, we integrated the iCub robot model delivered with the NRP. A sample experimental setup is depicted in Figure 9. The robot model is equipped with two cameras positioned at the eyes and is aimed toward one of the screens in the virtual room. Simple spiking neural networks similar to those used in the Braitenberg experiment control the eye motion. However, in this case, the network causes the robot’s eyes to track a moving visual stimulus displayed on the screen. As can be seen in the figure, the user can not only display the spike raster plot, but can also access the recordings of the two cameras placed in the head of the iCub model. We are currently integrating a much more realistic model from a computational framework for retina modeling (42).

Mouse model and soft-body simulation

The virtual mouse model is an essential component of the NRP and a key to comparative studies bridging traditional neuroscience and virtual neurorobotics experiments. Compared to the other robot models considered thus far, the simulation of a mouse body is especially demanding due to its soft body structure. The mouse experiment included in the first release of the NRP therefore adopts the same simple protocol as the other prototype experiments and focuses on the soft-body simulation. The result is shown in Figure 10. The mouse model is placed at a junction point of a Y-shaped maze. Each direction leads to a dead end with a screen. As in the Braitenberg experiment, the neural controller directs the mouse to move its head toward the red stimulus. A more detailed, realistic-appearing mouse model has been completed, and a prototype that maps the sensory areas of this model to corresponding cortical areas of a detailed point-neuron model of the mouse brain has been successfully tested. The results will soon be available on the NRP.

Closed-loop neuromorphic control of a biomimetic robotic arm

To complement the purely virtual experiments running on the NRP, we also implemented an initial physical neurorobotics experiment (29). The robot was a single-joint biomimetic arm with two antagonistic tendon-driven artificial muscles (Figure 11). It was assembled from design primitives of the modular Myorobotics toolkit (43), which allows the easy assembly of biomimetic robotic structures with arbitrary complexity. A SpiNNaker system (30) was connected to the robot via a dedicated hardware interface that translated between the communication protocols of SpiNNaker and Myorobotics (44). The arm was controlled by a cerebellum model simulation adapted to run on the SpiNNaker architecture from the system described by Luque et al. (45). In particular, we augmented the SpiNNaker software framework to support the supervised learning rule defined by the model. Based on this rule, the system successfully learned to follow a desired trajectory.

FIGURE 9. Live visualization of the iCub eye-tracking experiment.

FIGURE 8. Live visualization of the LAURON Braitenberg experiment in the web-based experiment simulation viewer.


Conclusions and outlook

Following the neurorobotics subproject’s overarching theme, “modular brains for modular bodies,” we have successfully integrated the most advanced tools and technologies from brain simulation, robotics, and computing in a unified and easy-to-use toolset that will enable researchers to study neural embodiment and brain-based robotics. This integrative approach combines and connects the results of all HBP subprojects, rendering the neurorobotics subproject a strategic pillar of HBP.

The NRP (neurorobotics.net) is the first integrated and Internet-accessible toolchain for connecting large-scale brain models to complex biomimetic robotic bodies. Based completely on simulations, the platform enables the design and execution of neurorobotics experiments at an unprecedented speed, which is accelerating scientific progress both in neuroscience and robotics. The neurorobotics workflow guarantees that the results can be rapidly transferred to real robots.

A set of pilot experiments running on the first public release of the NRP has yielded positive results and helped to refine the development roadmap. In the upcoming releases, we will focus on the integration of more realistic brain models comprising a very large number of neurons. To this end, we are currently augmenting the neural simulation interface with support for distributed setups in which many instances of NEST are running in parallel.

Additionally, we are implementing an interface to the SpiNNaker platform to allow neuromorphic simulation setups; an initial prototype system is already available. We are also making a concerted effort to ensure that the NRP is as attractive as possible to users. One example is a domain-specific language for defining transfer functions (46), which will serve as the basis for a graphical transfer function editor. On the modeling side, we will provide further environments and robots.

In the next phase of HBP, we will expand our collaboration with both internal and external partners to ensure that the NRP continues to reflect the cutting edge of both robotics and neuroscience. In particular, we are investigating embodied learning techniques, which are an essential prerequisite to endowing simulated brains with desired behaviors. We are also researching biomimetic robots with human-like musculoskeletal actuation, since these models are an important requirement for transferring results from the simulation to the real world.

We cannot stress enough that the development of the NRP and the neurorobotics subproject are completely open to the entire scientific community. Interested researchers from both academia and industry are strongly encouraged to become involved and contribute. For the upcoming public release of the NRP, we are extending our hardware resources to accommodate as many users as possible. Regular meetings and workshops organized by our subproject are open to everybody, which will not only promote the NRP but also establish a community for neurorobotics research.

FIGURE 10. Mouse model experiment with soft-body physics simulation and live visualization of spike trains.

FIGURE 11. Closed-loop neuromorphic control of a single-joint robot assembled from design primitives of the Myorobotics toolkit (left). The robot is controlled by a cerebellum model that runs on a SpiNNaker system (right). © 2016 IEEE. Reprinted, with permission, from (29).

References
1. A. K. Seth, O. Sporns, J. L. Krichmar, Neuroinformatics 3, 167–170 (2005).
2. Human Brain Project, website (2016); available at https://www.humanbrainproject.eu.
3. Neurorobotics in the HBP, website (2016); available at http://www.neurorobotics.net.
4. T. Ross, Scientific American 148, 206–209 (1933).
5. N. Wiener, Cybernetics: Or Control and Communication in the Animal and the Machine (Technology Press, Cambridge, MA; John Wiley & Sons, New York, NY; Hermann & Cie, Paris, 1948).
6. G. Metta, G. Sandini, D. Vernon, L. Natale, F. Nori, Proceedings of the 8th Workshop on Performance Metrics for Intelligent Systems (PerMIS ’08), R. Madhavan, E. Messina, Eds. (ACM, Gaithersburg, MD, August 2008), pp. 50–56.
7. I. Mizuuchi et al., Proceedings of the 2007 7th IEEE-RAS International Conference on Humanoid Robots (IEEE Robotics and Automation Society, Pittsburgh, PA, November–December 2007), pp. 294–299.
8. H. G. Marques et al., Proceedings of the 2010 10th IEEE-RAS International Conference on Humanoid Robots (IEEE Robotics and Automation Society, Nashville, TN, December 2010), pp. 391–396.
9. Roboy Anthropomimetic Robot, website (2016); available at http://www.roboy.org.
10. Honda Motor Company, “ASIMO: The World’s Most Advanced Humanoid Robot” (2016); available at http://asimo.honda.com.
11. E. Ackerman, E. Guizzo, “The Next Generation of Boston Dynamics’ ATLAS Robot Is Quiet, Robust, and Tether Free” (2016); available at http://spectrum.ieee.org/automaton/robotics/humanoids/next-generation-of-boston-dynamics-atlas-robot.
12. M. Raibert, K. Blankespoor, G. Nelson, R. Playter, in Proceedings of the 17th IFAC World Congress, M. J. Chung, P. Misra, H. Shim, Eds. (International Federation of Automatic Control, Seoul, Korea, July 2008), pp. 10822–10825.
13. B. M. Yamauchi, Proc. SPIE 5422, 228–237 (2004).
14. E. Ackerman, “Unlucky Robot Gets Stranded Inside Fukushima Nuclear Reactor, Sends Back Critical Data” (2015); available at http://spectrum.ieee.org/automaton/robotics/industrial-robots/robot-stranded-inside-fukushima-nuclear-reactor.
15. E. Ackerman, “Another Robot to Enter Fukushima Reactor, and We Wish It Were Modular” (2015); available at http://spectrum.ieee.org/automaton/robotics/industrial-robots/heres-why-we-should-be-using-modular-robots-to-explore-fukushima.
16. Aldebaran Robotics, website (2016); available at https://www.aldebaran.com/en.
17. H. Miyamoto, M. Kawato, T. Setoyama, R. Suzuki, Neural Netw. 1, 251–265 (1988).
18. G. N. Reeke, O. Sporns, G. M. Edelman, Proc. IEEE 78, 1498–1530 (1990).
19. B. Cox, J. Krichmar, IEEE Robot. Autom. Mag. 16, 72–80 (2009).
20. N. Burgess, J. G. Donnett, J. O’Keefe, Connect. Sci. 10, 291–300 (1998).
21. A. Crespi, K. Karakasiliotis, A. Guignard, A. J. Ijspeert, IEEE Trans. Robot. 29, 308–320 (2013).
22. V. Trianni, S. Nolfi, IEEE T. Evolut. Comput. 13, 722–741 (2009).
23. Y. Sugita, J. Tani, M. V. Butz, Adapt. Behav. 19, 295–316 (2011).
24. H. H. Lund, B. Webb, J. Hallam, Artif. Life 4, 95–107 (1998).
25. Y. Nakamura, K. Yamane, Y. Fujita, I. Suzuki, IEEE Trans. Robot. 21, 58–66 (2005).
26. M. Sreenivasa, A. Murai, Y. Nakamura, Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE/Robotics Society of Japan, Tokyo, Japan, November 2013), pp. 329–334.
27. Y. Yamada et al., Sci. Rep. 6, 27893 (2016).
28. Y. Yamada, K. Fujii, Y. Kuniyoshi, in IEEE Third Joint International Conference on Development and Learning and Epigenetic Robotics (ICDL) (IEEE, Osaka, Japan, August 2013), pp. 1–7.
29. C. Richter et al., IEEE Robot. Autom. Mag. (2016), doi: 10.1109/MRA.2016.2535081.
30. S. B. Furber, F. Galluppi, S. Temple, L. A. Plana, Proc. IEEE 102, 652–665 (2014).
31. C. Eliasmith et al., Science 338, 1202–1205 (2012).
32. H. Markram et al., Cell 163, 456–492 (2015).
33. F. Walter, F. Röhrbein, A. Knoll, Neural Process. Lett. 44, 103–124 (2016).
34. S. Kim, C. Laschi, B. Trimmer, Trends Biotechnol. 31, 287–294 (2013).
35. J. Schemmel et al., Proceedings of the 2010 IEEE International Symposium on Circuits and Systems (ISCAS 2010) (IEEE, Paris, France, May–June 2010), pp. 1947–1950.
36. Blender Foundation, website (2016); available at https://www.blender.org.
37. A. Roennau, G. Heppner, L. Pfotzer, R. Dillmann, in Nature-Inspired Mobile Robotics, K. J. Waldron, M. O. Tokhi, G. S. Virk, Eds. (World Scientific, Hackensack, NJ, 2013), pp. 563–570.
38. Clearpath Robotics, “Husky Unmanned Ground Vehicle” (2016); available at http://www.clearpathrobotics.com/husky-unmanned-ground-vehicle-robot.
39. Open Source Robotics Foundation, “Gazebo: Robot Simulation Made Easy” (2016); available at http://www.gazebosim.org.
40. M.-O. Gewaltig, M. Diesmann, Scholarpedia 2, 1430 (2007).
41. V. Braitenberg, Vehicles, Experiments in Synthetic Psychology (MIT Press, Cambridge, MA, 1986).
42. P. Martínez-Cañada, C. Morillas, B. Pino, F. Pelayo, in Artificial Computation in Biology and Medicine: International Work-Conference on the Interplay Between Natural and Artificial Computation, IWINAC 2015, Elche, Spain, June 1–5, 2015, Proceedings, Part I, J. M. Ferrández Vicente, J. R. Álvarez-Sánchez, F. de la Paz López, F. J. Toledo-Moreo, H. Adeli, Eds. (Springer International Publishing, Cham, 2015), pp. 47–57.
43. Myorobotics, website (2016); available at http://myorobotics.eu.
44. C. Denk et al., in Artificial Neural Networks and Machine Learning—ICANN 2013: 23rd International Conference on Artificial Neural Networks, Sofia, Bulgaria, September 10–13, 2013, Proceedings, V. Mladenov et al., Eds. (Springer, Berlin, Heidelberg, 2013), pp. 467–474.
45. N. R. Luque, J. A. Garrido, R. R. Carrillo, O. J. M. D. Coenen, E. Ros, IEEE Trans. Neural Netw. 22, 1321–1328 (2011).
46. G. Hinkel et al., Proceedings of the 2015 Joint MORSE/VAO Workshop on Model-Driven Robot Software Engineering and View-Based Software-Engineering (ACM, L’Aquila, Italy, July 2015), pp. 9–15.

Acknowledgments

This paper summarizes the work of many people in HBP, in particular the team of SP10 Neurorobotics, whose great contribution we would like to acknowledge. We also thank Florian Walter for his support in preparing this manuscript. Finally, we thank Csaba Erö for kindly providing Figures 3, 5, and 6. The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007–2013) under grant agreement 604102 (HBP).


Biologically inspired models for visual cognition and motion control: An exploration of brain-inspired intelligent robotics

Hong Qiao1, 2*, Wei Wu1, 3, and Peijie Yin4

The development of robots with human characteristics that can assist and even befriend humans has been a longstanding dream of many researchers in the field of robotics. However, although significant progress has been made in robotics in recent years, the flexibility, intelligence, and empathic ability of robots have not lived up to expectations. Here we review ways in which current robotics research is improving these human-like characteristics.

There are two main approaches for increasing human-like and even “friend-like” qualities in robots. The first is function- and performance-oriented. Based on observations and analyses of human behavior, models can be built to incorporate human-like cognition and decision-making into robotic systems. Up to this point, many biologically inspired models have been developed and applied to robotic vision, audition, and motion control. The second approach is mechanism- and structure-oriented. Based on recent findings in biology and brain science, robots have been developed to mimic neural mechanisms and biological structures. The integration of neuroscience and information science is a promising direction for both theoretical and practical research in robotics, and paves the way for the proposed brain-inspired intelligent robot.

These two main research directions are illustrated in Figure 1. In the function- and performance-oriented approach, researchers have focused on the input–output relationship in humans, treating it as a “black box.” In contrast, in the mechanism- and structure-oriented approach, brain-inspired intelligent robotics is developed to mimic, apply, and integrate internal mechanisms existing in humans. In this way, the black box can be clarified in a stepwise fashion. Our findings on neural mechanisms and biological structures in humans can be applied to robots to improve their perception, cognition, learning, and motion control. Furthermore, since brain-inspired intelligent robots implement internal neural mechanisms such as mood and emotion, they may have the potential to acquire empathy, so as to interact and cooperate with humans in a close and intimate manner (1). Such an advance in robotics could provide a substantial benefit in the national defense, manufacturing, and service sectors.

Through discussions with neuroscientists beginning in 2009, we have sought to improve the framework of brain-inspired intelligent robotics and build specific models inspired by neural mechanisms that include memory and association, attention modulation, generalized learning, and fast movement response. Combining this work with long-standing robotic vision and motion research, we have built a series of brain-inspired models that have improved upon the performance of current models. By introducing neural mechanisms for memory and association, attention, and generalized learning into the models, we have developed robots that exhibit robust recognition in complex environments and the ability to generalize (1–3).

Visual models

Over the years a number of different biologically inspired visual computational models of cognition have been proposed (4–6). The design of these models is based on related biological mechanisms and structures, and provides new solutions for visual recognition tasks. Here, we summarize our previous work on biologically inspired visual models.

We first introduced memory and association in the HMAX visual computational model (Figure 2). Objects were divided into two groups depending on whether they had episodic or semantic features. Here, episodic features refer to special image patches that correspond to the detailed features of the object, such as the key components (eyes, ears, nose, mouth) of a human face. Semantic features are those parts of an object that have a clear physical meaning, for example, whether or not the eyes or mouth on a face are big or small. Similar features found on different objects were saved aggregately, which aided in association and recognition. Ultimately, the model achieved recognition with familiarity discrimination (recognizing episodic features) and recollective matching (retrieval of memory of features seen in the past and comparing them with features on the object being viewed).

1State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
2Chinese Academy of Sciences Center for Excellence in Brain Science and Intelligence Technology (CEBSIT), Shanghai, China
3Huizhou Advanced Manufacturing Technology Research Center, Guangdong, China
4Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
*Corresponding Author: [email protected]

FIGURE 1. Brain-inspired intelligent robotics is emerging from the exploration of human behavior and biology. Approach A, function- and performance-oriented approach; Approach B, mechanism- and structure-oriented approach.


We have modified the model to provide better efficiency with a lower memory requirement and the ability to achieve recognition with faster speeds (1).
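
As a schematic illustration of the two recognition stages just described (familiarity discrimination on episodic patches and recollective matching on semantic attributes), consider the following simplified Python sketch. The cosine similarity, feature vectors, and weighting are our own illustrative choices, not the published model (1).

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Stored objects: episodic patch features (e.g., pooled patch responses) and
# semantic attributes (e.g., "eye size", "mouth size") saved per object.
memory = {
    "face_A": {"episodic": [0.9, 0.1, 0.4], "semantic": [0.8, 0.2]},
    "face_B": {"episodic": [0.2, 0.7, 0.5], "semantic": [0.3, 0.9]},
}

def recognize(candidate, w_episodic=0.6, w_semantic=0.4):
    scores = {}
    for name, obj in memory.items():
        familiarity = cosine(candidate["episodic"], obj["episodic"])   # familiarity discrimination
        recollection = cosine(candidate["semantic"], obj["semantic"])  # recollective matching
        scores[name] = w_episodic * familiarity + w_semantic * recollection
    return max(scores, key=scores.get), scores

print(recognize({"episodic": [0.85, 0.15, 0.35], "semantic": [0.75, 0.25]}))
```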

We further set out to improve the HMAX model so that it could achieve active cognition adjustment for occlusion and orientation of the object, as well as combine episodic and semantic features of the same object in a loop-discharge manner, meaning that one pair of episodic and semantic features could trigger other pairs of features from the same object, which were previously stored in the memory (Figure 3).

FIGURE 2. Introducing memory and association into the recognition model. Our model introduces biologically inspired memory and association mechanisms into HMAX, which consists of five parts: (1) Objects are memorized through episodic and semantic features (Block 1). Oi represents the ith object (i = 1... n), aji represents the jth semantic attribute (j = 1... m) of the ith object, and sxi represents the xth special episodic patch of the ith object. In face-recognition processing, an example of a semantic description might be that the eyes or mouth are large. A special episodic patch could be the image of an eye if the eyes are prominent on the face. (2) Features of one object are saved in distributed memory and common features of various objects are stored aggregately (Block 2). A1 to Am represent various common semantic attributes of different objects. Sx is a library for special episodic patches of different objects. A common attribute of different objects is stored in the same regions, for example, a11−a1n is the first feature of different objects that are stored together, called A1. (3) Recognition of one candidate (Block 3). Tt represents a candidate. The semantic features a1t to amt and the episodic patch feature sxt are extracted before recognition. (4) Familiarity Discrimination (Block 4). Familiarity discrimination is achieved through the HMAX model. Briefly, S1 and S2 correspond to simple cells in the primate visual cortex that carry out the filtering process, while C1 and C2 are complex cells that are the pooling layers of the corresponding S layers. Both sxi and sxt are extracted in the C1 layer of the HMAX model. The saved object can be ordered by comparing the similarity between the candidate and the saved objects in the C2 layer of the HMAX model. (5) Recollective Matching (Block 5). Recollective matching is achieved through integration of the semantic and episodic feature similarity analysis. For more information, see (1).

FIGURE 3. Introducing preliminary cognition and active attention into the recognition model. Unlike our previous work, here we propose a new formation of memory and association in Blocks 2 and 3. Specifically, the episodic and semantic features from the same object are saved together in Block 2. Features (fji) of the ith object include episodic features (pji, j = 1... m) and semantic features (sji, j = 1... m), and different features corresponding to the same component of an object are saved together. There are m distributed regions (Fj, j = 1... m) storing various common features of different objects. In Block 3, features are memorized in a distributed structure, and recognition can be achieved through loop discharge in the association process. Additionally, we mimic the structure and function of the inferior temporal cortex layer (Block 5) and add them to the HMAX model (Block 4) to achieve preliminary cognition and active attention adjustment. For more information, see (2).


For example, seeing the eyes of one person could invoke retrieval of a memory of the whole face. These improvements were implemented into the model based on relevant biological evidence, such as primates’ ability to adjust the orientation of an object in the anterior inferior temporal cortex, and their multitask processing ability in the posterior inferior temporal cortex. By achieving robust recognition with orientation and occlusion of the object, the applicable scenarios of robotic cognition have been extended (2). Such improvements can provide the basis for developing personalized services performed by robots.

We also performed simulations of the spontaneous and dynamic cognition process that occurs in human babies. We combined unsupervised feature learning with convolutional deep belief networks (CDBNs) to enable key components to be learned automatically without any prior knowledge, form the general knowledge of a class of objects, and achieve higher-level cognition and dynamic updating (Figure 4). This capacity could improve the ability of robots to learn on their own and to perform inductive generalization.

In order to mimic the neural mechanisms of human visual processing and incorporate them into our model, we modified CDBNs to extract semantic features of the object, form an integrated concept, and reselect features when the input is ambiguous (3).

FIGURE 4. Introducing active and dynamic learning into the recognition model. The new model for the cognition of a special class of objects is given in panel A (framework for the learning process that corresponds to the training process) and B (framework for the classification process that corresponds to the testing process). The model consists of seven parts: (1) Block 1 includes the input images. (2) Block 2 represents a deep neural network (DNN) model that is used for unsupervised key components learning. DNN includes a five-layer (I, H(1), P(1), H(2), P(2)) convolutional deep belief network (CDBN) and an upper feature layer MO-P2, where I is the input image of CDBN, H(1) and H(2) are two hidden layers, and P(1) and P(2) are two pooling layers. The last layer, MO-P2, is computed using a max-out operation on P(2). (3) Block 3 locates key components. Based on the outputs of the MO-P2 layer in CDBN, the spatial relation template of the key components is computed, and a support vector machine (SVM) classifier is trained to detect their positions. (4) Block 4 carries out contour detection and defines semantic geometric features based on this functionality. (5) Block 5 computes average semantic values as general knowledge and assesses semantic similarity between new samples and general knowledge. (6) Block 6 achieves final classification by combining episodic and semantic cognition. The subclass semantic description can help in achieving a higher-level cognition and is meaningful in associative recognition. (7) Block 7 adjusts the subclass descriptions of each sample according to the updated features. In summary, Block 2 simulates the mechanisms and functions of a primate visual cortex, predominantly from the primary visual cortex (V1) to posterior inferior temporal cortex (PIT). Blocks 3, 4, and 5 introduce mechanisms from the anterior inferior temporal cortex (AIT) of primates, while Block 6 simulates the functions of the prefrontal cortex (PFC) in primates. For more information, see (3).


These improvements promise to diminish uncertainties inherent in semantics and improve generalization, which will enable robots to achieve more robust recognition in a complex environment.
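
Figure 4 describes computing average semantic values as “general knowledge” and updating them as new samples of a class arrive. A minimal sketch of that idea, under our own simplifying assumptions about the similarity measure, could look like this:

```python
# "General knowledge" as a running average of semantic values, updated
# incrementally as new samples of a class are observed (illustrative only).

class GeneralKnowledge:
    def __init__(self, n_attributes):
        self.mean = [0.0] * n_attributes
        self.count = 0

    def update(self, semantic_values):
        # Incremental mean over all samples seen so far.
        self.count += 1
        self.mean = [m + (x - m) / self.count for m, x in zip(self.mean, semantic_values)]

    def similarity(self, semantic_values, scale=1.0):
        # Map the distance to the class average into a (0, 1] similarity score.
        dist = sum((x - m) ** 2 for m, x in zip(self.mean, semantic_values)) ** 0.5
        return 1.0 / (1.0 + scale * dist)

knowledge = GeneralKnowledge(n_attributes=3)
for sample in ([0.8, 0.1, 0.5], [0.7, 0.2, 0.6], [0.9, 0.1, 0.4]):
    knowledge.update(sample)
print(knowledge.mean, knowledge.similarity([0.75, 0.15, 0.55]))
```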

Motion models

To improve the motion of robots, we introduced the concept of a “brain-cerebellum-spine-muscle” structure, together with a “central–peripheral” nervous system, into the motion planning and movement control of robots. This addition allows us to better simulate human-like multi-input and multioutput movements and to build a neural encoding–decoding model for the generation of movement control signals. Through these advances we hope to improve the precision of movement and learning ability of robots, without sacrificing response speed (4).
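
One heavily simplified reading of the “central–peripheral” split is a slow planner that issues setpoints and a fast reflex-like loop that corrects errors at every control cycle. The toy single-joint simulation below is only meant to illustrate that division of labor; the gains, dynamics, and structure are our own assumptions, not the authors’ model.

```python
def central_planner(t_ms):
    # Slow "central" trajectory generation: a new joint setpoint every 100 ms.
    return 0.5 if (t_ms // 100) % 2 == 0 else -0.5

def peripheral_loop(target, angle, velocity, kp=200.0, kd=20.0):
    # Fast "peripheral" correction, analogous to a spinal feedback loop.
    return kp * (target - angle) - kd * velocity

angle, velocity, dt = 0.0, 0.0, 0.001   # 1 ms control cycle, toy unit-inertia joint
for step in range(300):                 # simulate 300 ms
    torque = peripheral_loop(central_planner(step), angle, velocity)
    velocity += torque * dt             # no friction, unit inertia
    angle += velocity * dt

print(round(angle, 3))
```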

In order to test these new concepts, we created biologically inspired human-like vision and motion platforms (Figure 5). The vision platform uses memory and association, active attention, and preliminary recognition based on human behavior to achieve robust robotic recognition of semantic features. The motion platform mimics structures and mechanisms in the central and peripheral nervous systems of humans to achieve fast responses, highly precise movements, and the ability to learn (4).

Conclusions and future directions

In conclusion, we have built sophisticated robotic cognition and control models based on neural mechanisms. These models have relied on investigations into the subtle differences and relationships between target tasks and related brain neural network mechanisms. What results from the implementation of these mechanisms is a highly abstracted version of their human counterparts. Our objective is to better understand the possible ways in which the human brain manages tasks, and to carefully select and abstract the appropriate mechanisms to make our models computationally realistic. The proposed models would then be consistent with actual biological mechanisms, but also applicable to real-world robotic environments.

In the future, we will continue to collaborate closely with neuroscientists to explore brain mechanisms that will further illuminate missing features of the black box model. As part of our interdisciplinary research in the Center for Excellence in Brain Science and Intelligence Technology at the Chinese Academy of Sciences (CAS), we have entered into close collaboration with neuroscientists from the CAS Institute of Neuroscience to integrate neuroscience and information science. Specifically, we are seeking to combine our current theoretical models with more sophisticated neural mechanisms of perception, cognition, learning, and motion.

For example, we plan to build a model inspired by human neural mechanisms that will mimic human perception, motion, decision-making, emotion, energy optimization, and interaction with other humans (Figure 6). The model will be developed in a systematic manner and will be based on the coordination between the central and peripheral nervous systems. Applying such novel models in brain-inspired intelligent robots will allow us to both improve the robustness and intelligence of robotic systems and verify their effectiveness in robotic design. In another, complementary research project, we are establishing brain-inspired intelligent robotic platforms for use in brain-science research.

References
1. H. Qiao, Y. L. Li, T. Tang, P. Wang, IEEE Trans. Cybern. 44, 1485–1496 (2014).
2. H. Qiao, X. Y. Xi, Y. L. Li, W. Wu, F. F. Li, IEEE Trans. Cybern. 45, 2612–2624 (2015).
3. H. Qiao, Y. L. Li, F. F. Li, X. Y. Xi, W. Wu, IEEE Trans. Cybern. 46, 2335–2347 (2016).
4. K. Fukushima, Neural Netw. 1, 119–130 (1988).
5. L. Itti, C. Koch, E. Niebur, IEEE Trans. Pattern Anal. Mach. Intell. 20, 1254–1259 (1998).
6. T. Serre, L. Wolf, S. Bileschi, M. Riesenhuber, T. Poggio, IEEE Trans. Pattern Anal. Mach. Intell. 29, 411–426 (2007).
7. H. Qiao, C. Li, P. J. Yin, W. Wu, Z.-Y. Liu, Assembly Autom. 36, 97–107 (2016).

Acknowledgments
This research was partially supported by the National Natural Science Foundation of China (61210009) and the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB02080003).

FIGURE 5. Biologically inspired human-like vision (A) and motion (B).
FIGURE 6. Ideal capabilities of brain-inspired intelligent robots.


Anthropomorphic action in robotics

Jean-Paul Laumond

The word “robot” was first coined in the early 20th century, and the seminal ideas of cybernetics first appeared during World War II. The birth of robotics is generally pinpointed to 1961, with the introduction of the first industrial robot on the General Motors assembly line: The Unimate robot was patented by George Devol and industrialized by Joseph Engelberger, who is recognized as the founding father of robotics. From its beginning in the 1960s to its broad application in the automotive industry by the end of the 1970s, the field of robotics was viewed as a way to improve production in manufacturing by providing a manipulator integrated into a well-structured environment. Stimulated by programs such as space exploration, the 1980s saw the creation of field robotics, where a robot’s environment was no longer closed. However, even then the robot remained in isolation, only interacting with a static world.

At the end of the 1990s, robotics began to be promoted within the service industry, which led to the development of simple wheeled mobile platforms capable of performing tasks such as cleaning or automated transportation. In the next stage of development, arms were added to the platforms, which allowed more expressive communication. This generation of robots, including most recently “Pepper the robot,” is able to enter into dialogue with humans: to welcome and guide them into public areas such as supermarkets, train stations, or airports. Incorporating arms into this type of robotic design has also allowed for object manipulation and physical interaction. However, the limitations of these wheeled mobile robots have become quite evident, sparking a quest for more anthropomorphic robots that can maneuver within a human environment, performing actions such as using stairs and moving over small obstacles. These features would allow for their mobility on any terrain; moreover, they would incorporate the capacity to perform dexterous manipulation. Developing such “humanoid” robots has become a major challenge in the field of robotics and, if fully realized, these devices will become the paragons of robotics science. Here I review the current status of anthropomorphic robots and how the fields of robotics, neuroscience, and biomechanics are coalescing to drive robotic innovation forward.

Anthropomorphic action

Whereas the ultimate goal of roboticists is to provide humanoid robots with autonomy, life scientists are striving to gain an understanding of the foundations of human action, in domains ranging from medicine and rehabilitation to ergonomics. Nevertheless, neuroscience and its quest to understand the computational foundation of the brain provide a further entry point to robotics. And despite their different scientific cultures and backgrounds, the communities of life scientists and roboticists are pursuing converging objectives.

Human beings and humanoid robots share a common anthropomorphic shape. A key to understanding anthropomorphic action that can bridge robotics and life sciences is gaining insight into the fundamental mechanisms of the human body. As an example, consider the actions in Figure 1, performed by the humanoid robot HRP-2 at the Laboratory for Analysis and Architecture of Systems of the French National Center for Scientific Research (LAAS-CNRS). In the first scenario, the robot answers a single order: “Give me the purple ball” (1). To accomplish the assigned objective, HRP-2 decomposes its task into elementary subtasks (Scenario A in Figure 1). A dedicated software module addresses each subtask. For instance, to reach the ball, the robot has to walk to the ball. “Walking” can be regarded as an elementary action that is a resource to solve the problem, and is processed by a dedicated locomotion module. In the second scenario (Scenario B in Figure 1), HRP-2 has to grasp a ball that is located between its feet (2). To accomplish this objective, the robot has to first step away from the ball and then grasp it. In this scenario, the significance of “stepping away” becomes a vital issue. In this experiment, there is no dedicated module in charge of “stepping,” which is a direct consequence of the requirement for “grasping.” Thus, no stepping “symbol” appears as a resource for problem solving in Scenario B. The grasping action is embedded in the robot’s body, allowing its legs to naturally contribute to the action. No deliberative reasoning is required for the robot to face complicated situations such as picking up a ball between the feet.
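To make the contrast concrete, the following Python sketch (our illustration, not the LAAS-CNRS software) shows how Scenario A reads as a fixed sequence of subtask “symbols,” each handled by a dedicated module; the embedded action of Scenario B has no such decomposition. The module names are invented for the example.

# Illustrative only: Scenario A as a fixed sequence of subtask "symbols",
# each handled by a dedicated (here trivial) module.
def locate(target):          return f"located {target}"
def walk_to(target):         return f"walked to {target}"
def grasp(target):           return f"grasped {target}"
def give(target, recipient): return f"gave {target} to {recipient}"

def give_me_the_ball(ball="purple ball", operator="operator"):
    plan = [
        (locate,  (ball,)),
        (walk_to, (ball,)),
        (grasp,   (ball,)),
        (locate,  (operator,)),
        (walk_to, (operator,)),
        (give,    (ball, operator)),
    ]
    return [step(*args) for step, args in plan]

print(give_me_the_ball())   # Scenario B has no such explicit sequence of symbols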

To design a robot capable of the embedded actions required to execute Scenario B, we must first imagine replacing the humanoid robot HRP-2 with a human being. Among all the possible motions required for grasping an object, we must consider the underlying principle for selection of a particular motion in humans. How do humans organize their behaviors to reach a given objective? Where within the brain does this reasoning take place? What are the relative contributions in Scenarios A and B of voluntary actions computed in the frontal cortex to reflexive actions computed by spinal reflexes? How and why are different actions computed by different mechanisms? What musculoskeletal synergies are required to simplify control of complex motions? Such questions lie at the core of current research in computational neuroscience and biomechanics. In the remainder of this review, I will briefly discuss three viewpoints on anthropomorphic action from a robotics, neuroscience, and biomechanics perspective. I will also comment on mathematical methods for anthropomorphic action modeling.

Laboratory for Analysis and Architecture of Systems, Centre National de la Recherche Scientifique (LAAS-CNRS), Toulouse University, Toulouse, France
Corresponding Author: [email protected]

A robotics perspective

In the quest for robot autonomy, R&D in robotics has been stimulated by competition between computer science and control theory, and between abstract symbol manipulation and physical signal processing, with the goal of embedding discrete data structures and continuous variables into a single architecture. This architecture is a way of decomposing complicated intelligent behavior into elementary modules, or “symbols,” capable of executing a well-defined function. Designing a robot architecture requires the careful “placing” of these symbols.

In the field of robotics, centralized architectures were first designed in manufacturing. In this hierarchical paradigm, the robot operates in a top-down fashion, combining predefined specialized functions for perception, decision, and control. Such architectures perform well in structured environments where a finite-state machine can describe the world of possible actions, as is the case in production engineering (3). Other architectures promote a bottom-up view, a seminal approach introduced by Rodney A. Brooks (4). Using the concept of subsumption, he proposed a reactive robot architecture organized by integrating low-level, sensory-motor loops into a hierarchical structure. In this paradigm, a behavior is decomposed into subbehaviors organized in a hierarchy of layers. Higher levels subsume lower levels according to the context. This research gave rise to the school of so-called “bioinspired” robotics, which emphasized mechanism design and control (5), and related schools in artificial intelligence, including multiagent systems (6), swarm robotics (7), and developmental robotics (8). Other types of architectures have tended to combine top-down and bottom-up views in a hybrid manner, integrating deliberative reasoning and reactive behaviors (9, 10).
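The subsumption idea can be illustrated in a few lines of Python. This is a minimal sketch assuming a toy two-layer robot (obstacle avoidance subsuming wandering); it is not a reproduction of Brooks's implementation, and the sensor fields and commands are invented for illustration.

# Minimal sketch of a Brooks-style subsumption controller (illustrative only).
# Higher layers subsume (override) lower ones when their triggering condition holds.

def avoid_obstacle(sensors):
    if sensors["obstacle_distance"] < 0.3:          # metres (assumed threshold)
        return "turn_away"
    return None                                      # layer stays silent

def wander(sensors):
    return "move_forward"                            # lowest layer always has a suggestion

LAYERS = [avoid_obstacle, wander]                    # ordered high -> low priority

def subsumption_step(sensors):
    for layer in LAYERS:
        command = layer(sensors)
        if command is not None:                      # higher layer suppresses the rest
            return command
    return "stop"

print(subsumption_step({"obstacle_distance": 0.2}))  # -> turn_away
print(subsumption_step({"obstacle_distance": 2.0}))  # -> move_forward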

The aim of all these approaches is to provide a generic solution for mobile robots as well as articulated mechanical systems, and to enable the robotic design to be independent of the mechanical dimensions of the system. Further developments in imposing anthropomorphic body considerations on humanoid robot architectures will involve the promotion of “morphological computation,” with its emphasis on the role of the body in cognition (11).

A computational neuroscience perspective

How to represent action is a key issue today in human science research, a field that encompasses endeavors ranging from neuroscience to the philosophy of mind (12). The subject itself, which ponders such questions as whether there can in fact be “representation” of action, has long been controversial. However, the discovery of mirror neurons by Rizzolatti (13) provided physiological evidence to support the concept of action representation, which was promoted by philosopher Edmund Husserl at the end of the 19th century (14).

In terms of motor control, the pioneering work of Nikolai Bernstein in the 1960s revealed the existence of motor synergies (15). The work of Bizzi and colleagues then provided biological evidence of this concept (16). Since then, numerous researchers have pushed the borders of their disciplines to discover laws and principles underlying human motion, which has in turn established the fundamental building blocks of complex movements (17–19). More recently, Alain Berthoz introduced the word “simplexity” to synthesize all these works into a single concept: To face the complexity of having such high dimensions in motor control space, living beings have created laws that link motor control variables and hence reduce computations (20).

A biomechanics perspective

In the 19th century, Étienne-Jules Marey introduced chronophotography to scientifically investigate locomotion, and was the first scientist to correlate ground reaction forces with kinetics. The value of jointly considering biomechanics, modern mathematics, and robotics is illustrated by the famous “falling cat” case study: Why is it that a falling cat always lands on its feet? The answer comes from the law of conservation of angular momentum: The cat can be modeled as a nonholonomic system1 whereby geometric control techniques perfectly explain the phenomenon (21). Thus, biomechanics provides models of motion generation (22),

FIGURE 1. An introductory example of embodied intelligence. Top panels, Scenario (A): The global task “Give me the ball” is decomposed into a sequence of subtasks: [locate the ball], [walk to the ball], [grasp the ball], [locate the operator], [walk to the operator], and [give the ball]. The motions [walk to], [grasp], and [give] appear as symbols of the decisional process that decomposes the task into subtasks. Bottom panels, Scenario (B): To grasp the ball between its feet, the robot has to step away from the ball. In this experiment, “stepping away” is not a software module, nor a symbol. It is an integral part of the embodied action “grasping.” The action in Scenario A is well segmented. The action in Scenario B is not: Unlike the command “walk to,” “stepping away” does not constitute a symbol.


which have subsequently been applied in ergonomics (23) and studies of athletic performance (24).

Mathematical methods for anthropomorphic action modeling

From a mechanistic point of view, the human (or humanoid) body is both a redundant system and an underactuated one. It is redundant because its number of degrees of freedom is usually much greater than the dimension of the tasks to be performed. It is underactuated because there is no direct actuator allowing the body to move from one place to another: To do so, the human must use its internal degrees of freedom and actuate all its limbs following a periodic process, namely bipedal locomotion. Actions take place in the physical space, while they originate in the sensory-motor space. Thus geometry is the core abstraction linking three fundamental action spaces: the physical space where the action is expressed, the motor space, and the sensory space. The emergence of symbols can be understood through the geometric structure of the system configuration space. Such a structure depends on the role of the sensors in action generation and control. As an example, in a recent study, we highlighted the role of the gaze in explaining the geometric shape of human locomotor trajectories (26).
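The redundancy point can be made concrete with a toy numerical example: a three-joint planar arm (three degrees of freedom) tracking a two-dimensional end-effector target, so the task leaves one degree of freedom free. This is a hedged sketch; the link lengths, step size, and target below are arbitrary assumptions.

import numpy as np

L = np.array([0.4, 0.3, 0.2])            # link lengths in metres (assumed)

def forward_kinematics(q):
    angles = np.cumsum(q)
    return np.array([np.sum(L * np.cos(angles)), np.sum(L * np.sin(angles))])

def jacobian(q):
    angles = np.cumsum(q)
    J = np.zeros((2, 3))
    for i in range(3):
        # column i: derivative of (x, y) with respect to joint angle i
        J[0, i] = -np.sum(L[i:] * np.sin(angles[i:]))
        J[1, i] =  np.sum(L[i:] * np.cos(angles[i:]))
    return J

q = np.array([0.3, 0.2, 0.1])
target = np.array([0.5, 0.4])
for _ in range(100):                      # pseudoinverse iteration toward the target
    err = target - forward_kinematics(q)
    q += np.linalg.pinv(jacobian(q)) @ (0.2 * err)

print(forward_kinematics(q), "~", target)  # 2-D task met; a whole family of q would do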

Whereas an action, such as “walk to” or “grasp,” is defined in the real world, it originates in the control space. The relationship between “action in the real world” and “motion generation” in the motor control space is defined in terms of differential geometry, linear algebra, and optimality principles (18, 27). Optimal control is based on well-established mathematical machinery, ranging from the analytical approaches initiated by Pontryagin (28) to recent developments in numerical analysis (29). It allows for motion segmentation as well as motion generation. Inverse optimal control, on the other hand, is a way to model human motion in terms of controlled systems. Specifically, given an underlying hypothesis about the system, together with a set of observed natural actions recorded from an experimental protocol performed on several participants, optimization determines the cost function of the system. From a mathematical point of view, the inverse problem is much more challenging than the direct one. Recent studies have been published in this area utilizing numerical analysis (30), statistical analysis (31), and machine learning (32, 33).
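As a hedged illustration of the inverse problem, the sketch below recovers the control-effort weight of a quadratic cost from an “observed” trajectory of a toy double integrator: the direct LQR problem is solved for candidate weights by iterating the discrete Riccati equation, and the weight whose optimal trajectory best matches the observation is kept. The system matrices and weight grid are assumptions chosen for illustration, not taken from the cited studies.

import numpy as np

A = np.array([[1.0, 0.1], [0.0, 1.0]])    # discrete double integrator, dt = 0.1 (assumed)
B = np.array([[0.005], [0.1]])
Q = np.eye(2)                              # state cost fixed; control weight w is unknown

def lqr_gain(w, iters=500):
    R, P = np.array([[w]]), Q.copy()
    for _ in range(iters):                 # iterate the discrete Riccati equation
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

def rollout(w, T=50):
    K, x, traj = lqr_gain(w), np.array([1.0, 0.0]), []
    for _ in range(T):
        x = A @ x + B @ (-K @ x)           # apply the optimal feedback law
        traj.append(x.copy())
    return np.array(traj)

observed = rollout(w=0.3)                  # pretend this trajectory came from an experiment
candidates = np.linspace(0.05, 1.0, 20)
errors = [np.sum((rollout(w) - observed) ** 2) for w in candidates]
print("recovered control weight ~", candidates[int(np.argmin(errors))])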

Movement is a distinctive attribute of living systems. It is the source of action. Robots are computer-controlled machines endowed with the ability to move. Whereas living systems move to survive, robots move to perform actions defined by humans. Exploring the computational foundations of human action therefore provides a promising route to engineering better humanoid robots. As a movement science, geometry offers a suitable abstraction, allowing for fruitful dialog and mutual understanding between roboticists and life scientists.

References
 1. E. Yoshida et al., Comput. Animat. Virtual Worlds 20, 511–522 (2009).
 2. O. Kanoun, J.-P. Laumond, E. Yoshida, Int. J. Robot. Res. 30, 476–485 (2011).
 3. A. Jones, C. McLean, J. Manuf. Syst. 5, 15–25 (1986).
 4. R. Brooks, Artif. Intell. 47, 139–159 (1991).
 5. S. Hirose, Biologically Inspired Robots: Snake-Like Locomotors and Manipulators (Oxford University Press, Oxford, 1993).
 6. Y. Shoham, K. Leyton-Brown, Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations (Cambridge University Press, Cambridge, 2008).
 7. E. Bonabeau, M. Dorigo, G. Theraulaz, Swarm Intelligence: From Natural to Artificial Systems (Oxford University Press, Oxford, 1999).
 8. R. Arkin, T. Balch, J. Exp. Theor. Artif. Intell. 9, 175–189 (1997).
 9. R. Alami, R. Chatila, S. Fleury, M. Ghallab, F. Ingrand, Int. J. Robot. Res. 17, 315–337 (1998).
10. M. Asada et al., IEEE Trans. Auton. Ment. Dev. 1, 12–34 (2009).
11. R. Pfeifer, J. Bongard, How the Body Shapes the Way We Think: A New View of Intelligence (MIT Press, Cambridge, 2007).
12. M. Jeannerod, The Cognitive Neuroscience of Action (Wiley-Blackwell, Hoboken, NJ, 1997).
13. G. Rizzolatti et al., Cogn. Brain Res. 3, 131–141 (1996).
14. E. Husserl, Ideas: General Introduction to Pure Phenomenology (Macmillan, New York, 1931).
15. N. Bernstein, The Co-ordination and Regulation of Movements (Pergamon Press, Oxford, 1967).
16. E. Bizzi, F. A. Mussa-Ivaldi, S. Giszter, Science 253, 287–291 (1991).
17. M. J. E. Richardson, T. Flash, J. Neurosci. 22, 8201–8211 (2002).
18. E. Todorov, Nat. Neurosci. 7, 907–915 (2004).
19. K. Mombaur, A. Truong, J. P. Laumond, Auton. Robot. 28, 369–383 (2010).
20. A. Berthoz, Simplexity: Simplifying Principles for a Complex World (Yale University Press, New Haven, 2012).
21. Z. Li, J. F. Canny, Eds., Nonholonomic Motion Planning (Kluwer Academic Publishers, Dordrecht, 1993).
22. A. Chapman, Biomechanical Analysis of Fundamental Human Movements (Human Kinetics, Champaign, IL, 2008).
23. S. Kumar, Ed., Biomechanics in Ergonomics (CRC Press, Taylor & Francis, London, 2008).
24. M. R. Yeadon, in Biomechanics in Sport: Performance Enhancement and Injury Prevention, V. M. Zatsiorsky, Ed., Encyclopaedia of Sports Medicine, Vol. IX (Blackwell Science Ltd., Oxford, 2000), pp. 273–283.
25. H. Poincaré, Revue de Métaphysique et de Morale 3, 631–646 (1895).
26. M. Sreenivasa, K. Mombaur, J.-P. Laumond, PLOS ONE 10, e0121714 (2015).
27. J.-P. Laumond, N. Mansard, J. B. Lasserre, Commun. ACM 58, 64–74 (2015).
28. L. Pontryagin et al., The Mathematical Theory of Optimal Processes (Wiley, Interscience Publishers, New York, 1962).
29. J. F. Bonnans, J. C. Gilbert, C. Lemaréchal, C. A. Sagastizábal, Numerical Optimization: Theoretical and Practical Aspects (Springer, Berlin, Heidelberg, 2006).
30. M. Diehl, K. Mombaur, Eds., Fast Motions in Biomechanics and Robotics, Lecture Notes in Control and Information Sciences 340 (Springer, Berlin, Heidelberg, 2006).
31. T. Inamura, Y. Nakamura, I. Toshima, Int. J. Robot. Res. 23, 363–377 (2004).
32. T. Mitchell, Machine Learning (McGraw Hill, New York, 1997).
33. J. Kober, J. Bagnell, J. Peters, Int. J. Robot. Res. 32, 1238–1274 (2013).

1. A nonholonomic system is a system whose reachable space dimensions are greater than the number of its motors.


Actor–critic reinforcement learning for autonomous control of unmanned ground vehicles

Xin Xu*, Chuanqiang Lian, Jian Wang, Han-gen He, and Dewen Hu*

College of Mechatronics and Automation, National University of Defense Technology, Changsha, China
*Corresponding Authors: [email protected] (X.X.), [email protected] (D.H.)

Unmanned ground vehicles (UGVs) are receiving substantial attention for their potential utility in a variety of domains such as the automobile industry, transportation systems, and smart manufacturing (1–3). However, attaining autonomous control of UGVs in complex environments remains a significant challenge. In particular, self-learning abilities are needed in the skill and performance optimization of autonomous driving systems. In this article, we will introduce some recent work in an area of study called “reinforcement learning” for autonomous control of UGVs (4–6). These types of self-learning methodologies are inspired by the so-called “actor–critic” learning mechanism of the human brain, which can improve the effectiveness of actions by receiving delayed evaluative feedback from environmental interactions.

The UGV platform used for this study won the 2014 Challenge for Unmanned Ground Vehicles in China. Advanced sensing and information fusion theories and technologies have been developed for the UGV platform, including visual saliency detection, stereo vision for obstacle detection, and long-distance lane detection (7–9), among others. We introduce our research on developing the self-learning capabilities of the UGV (4, 5) below. In particular, this work involved analyzing the connections between the actor–critic learning algorithms and the activity pattern of dopamine neurons in the brain, and developing two actor–critic learning control algorithms for the self-learning motion control of UGVs.

We have modeled the learning control problem as a Markov decision process (MDP), and the learning objective is to find an optimal control policy that optimizes the long-term cumulative rewards. We designed the rewards according to the performance measure of the learning control problems being analyzed, such as tracking errors. The optimal policy is a deterministic mapping or function from input states to control actions. In practice, due to the large state and action space of the underlying MDP, a near-optimal or suboptimal policy is usually obtained. In our study, by using a batch-mode actor–critic learning controller founded on kernel-based least-squares policy iteration (KLSPI) (10), a near-optimal policy is learned for longitudinal control of the UGV. The near-optimal policy was obtained offline by the KLSPI algorithm using samples collected from the control system. Furthermore, we also designed an online actor–critic learning controller developed from kernel-based dual heuristic programming (KDHP) (11) to realize self-learning lateral control. We conducted a variety of field experiments to test the performance of the self-learning and self-optimization capabilities of the brain-like actor–critic learning controllers. The experimental results indicate that actor–critic learning control not only autonomously controls UGVs, but also improves their driving skills through self-learning mechanisms.

Reinforcement learning in robotics

As a class of wheeled mobile robots, UGVs have been widely studied for their potential to improve intelligent transportation systems, automobile safety, and space exploration. Yet despite advances in areas related to their development, the application of UGVs to real environmental situations has not been realized to its full potential. From the perspective of motion control, the controller design of UGVs has to deal with uncertainties in vehicle dynamics, varying road conditions, and constraints in kinematics such as the minimum turning radius of a vehicle. To overcome these difficulties, various advanced control approaches have been studied (12–15), with convergence and stability analyses among the most typical dynamic control methods. Yet even with these sophisticated methods, challenging problems remain for UGV control. For example, designing optimal or near-optimal motion controllers for UGVs where there are uncertainties and disturbances remains a serious challenge (6, 7). Thus, it will be critically important to develop UGVs capable of self-learning such that the driving skills of the autonomous control system continually improve during their interactions with uncertain environments.

As a major class of machine learning methods, reinforcement learning (RL) (16, 17) has been studied in the robotics field in both simulated and real-world scenarios (18, 19). RL is distinct from supervised learning and mathematical programming methods and has been shown to be an efficient way to solve sequential decision problems modeled as MDPs, where some modeling information is unknown. Thus, compared with dynamic programming, RL is better suited for solving sequential optimization and control problems with inherent uncertainties. However, one of the main problems in RL is the “curse of dimensionality,” which means that computational and storage costs increase exponentially with the number of state dimensions. To solve this problem, approximate RL methods, also called “approximate dynamic programming” or “adaptive dynamic programming” (ADP), have received more and more attention in recent years (20, 21). A variety of approximate RL methods have been developed that fall mainly into three categories: value function approximation (VFA), policy search, and actor–critic methods (21, 22). VFA aims to estimate the value function of an MDP using function approximators such as neural networks. In the framework of VFA, successful results have been obtained by making use


of deep neural networks, such as the AlphaGo program (23). Policy search involves directly performing a search process in the policy space, using either heuristic information or gradient rules. Actor–critic methods can be viewed as a hybrid of VFA and policy search, where the critic is used for VFA and the actor is used for policy search. Actor–critic learning control has been shown to be important for human skill learning based on experience and evaluative feedback, especially in the case of improving driving skills. However, to our knowledge, little work has been done on actor–critic learning control methods in real-time field testing of UGVs.

Actor–critic learning in the human brain

Research from fields such as machine learning, operations research, robotics, and cognitive science has contributed to a better understanding of the reinforcement learning mechanisms in the human brain (24–27). Various RL models have been established to describe the reinforcement-based cognitive process from the neurochemical level to higher system levels (28, 29).

In RL, the reward function plays an important role by indicating the objectives to be optimized. Many experimental results have demonstrated that reward-directed behaviors rely on the activity of dopamine (DA) neurons from the midbrain ventral tegmental area (VTA) and substantia nigra pars compacta (24, 25). It has been shown that DA not only implements synaptic transmission but also generates long-term effects, which are executed through DA’s modulation of synaptic plasticity at target structures that include the basal ganglia, prefrontal cortex, and amygdala (24).

The actor–critic model has been widely studied to understand how optimized action policies are generated from reinforced behaviors (24). In an actor–critic structure, the critic implements a form of temporal-difference (TD) learning that is closely related to the DA neuron activity, whereas the actor learns an action policy that can optimize the long-term cumulative rewards (Figure 1). The learning problem is modeled as an MDP described as a tuple {S, A, P, R}, where S is the state space, A denotes the action space, P is the state transition probability, and R is the reward function. By making use of the TD learning algorithm (30–32), the critic estimates the state-value function V^π(s), which is the expected, discounted total reward when starting from s and following policy π thereafter, where the expectation is with respect to the state transition probability and the action policy:

V^{\pi}(s) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} r_{t} \,\middle|\, s_{0} = s\right]    (1)

The activity pattern of DA neurons not only estimates the state-value function but also estimates the reward prediction error or the TD error:

\delta_{t} = r_{t} + \gamma \hat{V}(s_{t+1}) - \hat{V}(s_{t})    (2)

where st+1 is the successor state of st, V̂(s) denotes the estimate of the value function V(s), γ is the discount factor, and rt is the reward received after the state transition from st to st+1.

Based on the predicted value function in the critic, the actor updates the action policies either by using a so-called “greedy policy search” or by performing an update based on an approximated policy gradient. For the greedy policy search, the action at each state is greedy with respect to the estimated value function, meaning that the action with the maximum value function will be selected. For the approximated policy gradient, the action policy is updated along an estimated gradient direction for optimizing the policy performance, which can be measured by the estimated value function.
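The interplay between the two components can be sketched in a few lines. The tabular actor–critic below runs on a toy five-state chain and is only illustrative: the critic applies the TD(0) update of Eq. (2), and the actor adjusts softmax action preferences with the TD error as its gradient signal. The UGV controllers described next replace these tables with kernel-based function approximators.

import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, gamma = 5, 2, 0.95
V = np.zeros(n_states)                         # critic: state values
prefs = np.zeros((n_states, n_actions))        # actor: softmax action preferences

def step(s, a):
    # toy chain dynamics (assumed): action 1 moves right, action 0 moves left
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s_next == n_states - 1 else 0.0
    return s_next, reward

for episode in range(500):
    s = 0
    for _ in range(50):
        probs = np.exp(prefs[s]) / np.exp(prefs[s]).sum()
        a = rng.choice(n_actions, p=probs)
        s_next, r = step(s, a)
        td_error = r + gamma * V[s_next] - V[s]        # TD error, Eq. (2)
        V[s] += 0.1 * td_error                          # critic update
        prefs[s] -= 0.05 * td_error * probs             # actor: softmax policy gradient
        prefs[s, a] += 0.05 * td_error
        s = s_next

print(np.round(V, 2))   # values grow toward the rewarding end of the chain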

Designing an actor–critic learning controller for UGVs

In our recent work (5), a self-learning cruise controller was developed for longitudinal control of a UGV, as depicted in Figure 2. The self-learning controller integrates the KLSPI algorithm (10, 33) with a proportional–integral (PI) controller with adjustable PI parameters. The output of the actor determines an optimized selection of the adjustable PI parameters. In Figure 2, KP and KI are the coefficients of the PI controller, vc and ac are the current speed and acceleration of the vehicle, vd is the desired speed, Δv is the difference between vc and vd, and u denotes the throttle and brake input. Velocity and acceleration profiles vt and at were designed to specify the desired longitudinal motion property. After generating the learning target, a reward function is defined to minimize the velocity tracking error after collecting the observation data.
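A skeleton of that control loop might look as follows. The vehicle model, candidate gain set, and the stand-in policy are assumptions for illustration; in the actual system the gain pair is chosen as the greedy action of the learned KLSPI policy rather than by the hand-written rule used here.

import numpy as np

GAIN_SET = [(0.5, 0.05), (1.0, 0.10), (2.0, 0.20)]     # candidate (KP, KI) actions (assumed)

def policy(state):
    # Stand-in for the learned actor: in the real controller the (KP, KI) pair
    # would be the greedy action of KLSPI; a simple rule fills that role here.
    vc, ac, dv = state
    return GAIN_SET[2] if abs(dv) > 2.0 else GAIN_SET[0]

vc, ac, integral, vd, dt = 0.0, 0.0, 0.0, 10.0, 0.05    # dt matches the 0.05-s sampling step
for _ in range(400):
    dv = vd - vc                                        # velocity tracking error
    kp, ki = policy((vc, ac, dv))
    integral += dv * dt
    u = kp * dv + ki * integral                         # throttle/brake command
    ac = np.clip(u, -3.0, 3.0)                          # crude longitudinal dynamics (assumed)
    vc += ac * dt
    # the learner's reward would penalize the tracking error, e.g. r = -dv**2

print(f"final speed {vc:.2f} m/s (target {vd} m/s)")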

In the self-learning cruise controller, the state of the underlying MDP is defined as a vector containing vc, ac, and Δv, and the action is the selected KP–KI vector. In the critic, the value function outputs the state-action value for every KP–KI vector in its current state, and the least-squares TD (LSTD) learning algorithm (32, 34) is used to evaluate the value function under a given policy. In order to realize generalization in large or continuous state spaces, function approximators are usually employed. Under a stationary policy pn at iteration n, when a state transition trajectory {s1, a1, r1, s2, a2, …sN, aN, rN} is observed, the following LSTD

FIGURE 1. Schematic of the actor–critic learning control structure. The system consists of a critic that predicts value functions and an actor that learns the action policies for optimizing long-term rewards. xt and rt denote the state of the Markov decision process and the reward at time step t, respectively; at is the control output at time step t; V(xt) and λ(xt) are the estimated value function or value function derivatives of state xt; x̂t+1 is the predicted state at time t + 1.


The system states are defined as s = [ex, ey, eθ, ρ, v]T, where v is the speed of the UGV; ex, ey, eθ are the tracking errors in the x-direction, y-direction, and orientation, respectively; and ρ is the curvature of path point (cx, cy), which can be calculated by

(9)
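Because Eq. (9) was not reproduced here, the following sketch shows one standard way to estimate the curvature ρ at a path point from its two neighbouring points by finite differences; the original formulation may differ in detail.

import numpy as np

def curvature(prev, point, nxt, dt=1.0):
    # central finite differences for the first and second derivatives of the path
    x0, y0 = prev; x1, y1 = point; x2, y2 = nxt
    dx, dy = (x2 - x0) / (2 * dt), (y2 - y0) / (2 * dt)
    ddx, ddy = (x2 - 2 * x1 + x0) / dt**2, (y2 - 2 * y1 + y0) / dt**2
    return (dx * ddy - dy * ddx) / (dx**2 + dy**2) ** 1.5

# three points on a circle of radius 5 m -> curvature close to 1/5
theta = np.array([0.0, 0.1, 0.2])
pts = list(zip(5 * np.cos(theta), 5 * np.sin(theta)))
print(curvature(*pts, dt=0.1))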

update rule (34) is used to estimate the approximated action value function:

\tilde{A}_{N} = \sum_{t=1}^{N-1} \phi_{t}\left(\phi_{t} - \gamma\,\phi_{t+1}\right)^{T}, \qquad \tilde{b}_{N} = \sum_{t=1}^{N-1} \phi_{t}\, r_{t}    (3)

\vec{\theta}_{N} = \tilde{A}_{N}^{-1}\, \tilde{b}_{N}    (4)

where φt = φt(st, at) is the feature vector and θ is the weight vector.

In the actor, for each observed state, the policy improvement process computes a greedy action with the largest state-action value, and the selected action determines a corresponding KP–KI vector to update the coefficients of the PI controller:

\pi_{n+1}(s) = \arg\max_{a \in A} \hat{Q}^{\pi_{n}}(s, a)    (5)

For a deterministic policy set, there exists an optimal policy π* that maximizes the action-value function Q^π for all state–action pairs:

Q^{*}(s, a) = \max_{\pi} Q^{\pi}(s, a), \qquad \forall\, (s, a) \in S \times A    (6)

If the optimal action value function Q*(s, a) is computed, the optimal policy can be obtained by

\pi^{*}(s) = \arg\max_{a \in A} Q^{*}(s, a)    (7)

In KLSPI and other kernel-based RL algorithms, a set of regularized or sparsified kernel-based features \vec{k} is generated in a data-driven way and the value function is approximated in the linear form \hat{Q}(s, a) = \vec{\theta}^{\,T}\vec{k}(s, a). In this way, the generalization ability of RL algorithms can be improved without much human intervention, as described previously in detail (10, 33).
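The linear-in-features value estimation at the heart of LSTD can be sketched as follows. The example uses a handful of fixed Gaussian kernel features on a toy one-dimensional chain, a simplified stand-in for the sparsified kernel dictionary of KLSPI; the constants and toy dynamics are assumptions.

import numpy as np

rng = np.random.default_rng(1)
centers, width, gamma = np.linspace(0.0, 10.0, 6), 1.5, 0.9

def features(s):
    # fixed Gaussian kernel features (a stand-in for a learned, sparsified dictionary)
    return np.exp(-((s - centers) ** 2) / (2 * width ** 2))

# collect transitions under a fixed policy: drift to the right, reward at the far end
transitions, s = [], 0.0
for _ in range(2000):
    s_next = min(s + 1.0 + 0.1 * rng.standard_normal(), 10.0)
    r = 1.0 if s_next >= 10.0 else 0.0
    transitions.append((s, r, s_next))
    s = 0.0 if s_next >= 10.0 else s_next

# batch LSTD(0): solve A theta = b with A = sum phi (phi - gamma phi')^T, b = sum phi r
A, b = np.zeros((6, 6)), np.zeros(6)
for s, r, s_next in transitions:
    phi = features(s)
    phi_next = np.zeros(6) if s_next >= 10.0 else features(s_next)   # terminal state
    A += np.outer(phi, phi - gamma * phi_next)
    b += phi * r
theta = np.linalg.solve(A + 1e-6 * np.eye(6), b)

print([round(float(theta @ features(x)), 2) for x in range(0, 11, 2)])  # values rise toward 10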

In addition to the self-learning longitudinal controller, the lateral control problem was solved using another type of online actor–critic learning controller, namely KDHP (11). In contrast to KLSPI, KDHP is more suitable for MDPs with continuous action spaces. In KDHP, a critic network is used to approximate the derivative of the cost function, which has been shown to be more beneficial for reducing the estimation variance of the policy gradient in the actor.

Figure 3 depicts the architecture of the KDHP-based lateral controller. s(k) and s(k + 1) are the observed states at time steps k and k + 1. The actor generates the control u(s(k)). The critic evaluates the control performance under the current control policy. Critic #1 is equivalent to critic #2, and together they are used to estimate the value function derivatives at different time steps. The value function derivative is defined as:

\lambda(s) = \frac{\partial V(s)}{\partial s}    (8)

FIGURE 3. Architecture of the KDHP-based lateral controller. See text for explanation of variables shown.

FIGURE 4. The UGV platform. The UGV is equipped with laser radars, cameras, and a GPS, as well as an inertial navigation system.

FIGURE 2. Block diagram of the self-learning cruise controller. See text for explanation of variables shown.


The control signal is defined by u = \tan\delta / L - \rho and is subject to the following constraint:

(10)

where δ is the steering angle, L is the distance between the front and the rear axles, and aLmax is the maximum lateral acceleration. The reward function for evaluating the lateral control performance can then be defined by

(11)

where Pi (i = 1, 2, 3) and Rj (j = 1, 2) are positive constants. In the critic of KLSPI and KDHP, a data dictionary Dn with a reduced number of observed data points is constructed, \vec{\alpha} = [\alpha_{1}, \alpha_{2}, \ldots, \alpha_{T}]^{T} is the coefficient vector for approximating λ(s), and \vec{k}(x_{t}) = \left(k(x_{1}, x_{t}), k(x_{2}, x_{t}), \ldots, k(x_{T}, x_{t})\right)^{T}. The state–action value function and its derivative are approximated as:

\hat{Q}(s, a) = \sum_{j=1}^{d(n)} \alpha_{j}\, k\!\left(q_{j}, (s, a)\right)    (12)

\hat{\lambda}(s) = \sum_{j=1}^{d(n)} \alpha_{j}\, k(s_{j}, s)    (13)

where d(n) is the length of the dictionary Dn, qj = (sj, aj), and sj(j = 1,2,…,d(n)) are the elements of the data dictionary. Based on the kernel-based features, the online learning rules in the critic of KDHP were designed by the recursive LSTD learning approach (32).

In the actor training of KDHP, the following online policy gradient method is used:

FIGURE 6. Spatial layout of the campus road. m, meters.

FIGURE 7. Lateral error curves of kernel-based dual heuristic programming (KDHP) and pure pursuit controllers (18). m, meters.

FIGURE 5. Results before and after learning through the LSPI and KLSPI algorithms. Controller i (i = 1, 2), in panels A and B, is the proportional–integral (PI) controller whose fixed proportional and integral coefficients together correspond to the ith action of the action set. Controllers 3 and 4 (panels C and D) are both learning controllers with the same architecture. The former uses the policy obtained by least-squares policy iteration (LSPI) (33), while the latter employs the policy obtained by the kernel-based LSPI (KLSPI) algorithm (5).



(14)

Experimental testing of a UGV using brain-like learning control

The UGV platform used for testing is shown in Figure 4. It weighs 2,550 kg and its length, width, and height are 5.08 m, 1.94 m and 1.9 m, respectively. It is equipped with a 5-speed automatic transmission. A customized drive-by-wire system was designed for the low-level actuator system, which realizes communications based on the controller area network (CAN) bus. The controller is implemented by a Motorola PowerPC architecture, which has a 450-MHz CPU and 512-MB error correction code (ECC) SDRAM memory (4, 5).

As discussed previously, the KLSPI algorithm realizes the policy-learning process in a batch mode, and a sample collection process is designed before learning commences (10). The sample collection process is performed in an urban traffic environment with a sampling time step of 0.05 seconds. For each episode, the maximum number of state transition steps is 200, giving a total of 10 seconds. The sample set used for KLSPI has 500 episodes of state-transition data.

Figure 5 shows the performance comparisons among different longitudinal controllers. In Figure 5, panels A and B depict the velocity-tracking performance of two manually optimized PI controllers, which represents performance prior to learning. Panels C and D show the velocity tracking performances of the LSPI-based learning controller and KLSPI-based learning controller, respectively. For the LSPI-based learning controller, some manually designed features were used (5). The data shows that the KLSPI-based controller achieves superior performance compared to manually optimized PI controllers and

the LSPI-based learning controller. Table 1 shows the quantitative evaluation of the four

controllers. The performance measure includes the average absolute error and the standard deviation of speed tracking. It is shown that the KLSPI-based self-learning controller has the smallest tracking error (bolded).

Making use of the KDHP algorithm, we implemented an online learning control process for lateral control of the UGV platform (6). Figure 6 shows the configuration of the campus road used for the lateral control experiments. The lateral error curves of KDHP and a popularly used lateral control method called “pure pursuit control” are shown in Figure 7. The data shows that the KDHP controller has a smaller lateral error, especially from 0 to 40 seconds.

Table 2 shows the performance comparisons between these two controllers based on lateral control experiments on the campus road. The data indicates that, after learning, the KDHP controller has a smaller average lateral error and standard deviation (bolded). In addition, the KDHP controller has a smaller maximum lateral acceleration than pure pursuit control (bolded).

Based on the above results, we conclude that the actor–critic self-learning controller can effectively optimize the control performance of UGVs without much prior knowledge or human intervention. The proposed methodologies will be important and beneficial for improving the driving skills and performance of UGVs during their interaction with complex environments.

References
 1. J. Ziegler et al., IEEE Trans. Intell. Transp. Syst. 6, 8–20 (2014).
 2. S. Thrun, S. Strohband, G. Bradski, J. Field Robot. 9, 661–692 (2004).
 3. V. Milanés, J. Villagrá, J. Godoy, C. González, IEEE Trans. Control Syst. Technol. 20, 770–778 (2012).
 4. J. Wang et al., J. Field Robot. 30, 102–128 (2013).
 5. J. Wang, X. Xu, D. Liu, Z. Sun, Q. Chen, IEEE Trans. Control Syst. Technol. 22, 1078–1087 (2014).
 6. C. Lian, Optimal Control Methods Based on Approximate Dynamic Programming with Applications to Unmanned Ground Vehicles, Ph.D. thesis (National University of Defense Technology, Changsha, China, 2016).
 7. J. Li, M. D. Levine, X. An, X. Xu, H. He, IEEE Trans. Pattern Anal. Mach. Intell. 35, 996–1010 (2013).
 8. T. Hu, B. Qi, T. Wu, X. Xe, H. He, Comput. Vis. Image Und. 116, 908–921 (2012).
 9. X. Liu, X. Xu, B. Dai, J. Cent. South Univ. 19, 1454–1465 (2012).
10. X. Xu, D. Hu, X. Lu, IEEE Trans. Neural Netw. 18, 973–992 (2007).
11. X. Xu, Z. Hou, C. Lian, H. He, IEEE Trans. Neural Netw. Learn. Syst. 24, 762–775 (2013).
12. R. Rajamani, H.-S. Tan, B. K. Law, W.-B. Zhang, IEEE Trans. Control Syst. Technol. 8, 695–708 (2000).
13. J.-J. Martinez, C. C. de Wit, IEEE Trans. Control Syst. Technol. 15, 246–258 (2007).
14. B. J. Patz, Y. Papelis, R. Pillat, G. Stein, D. Harper, J. Field Robot. 25, 528–566 (2008).
15. K. R. S. Kodagoda, W. S. Wijesoma, E. K. Teoh, IEEE Trans. Control Syst. Technol. 10, 112–120 (2002).
16. R. S. Sutton, A. G. Barto, Reinforcement Learning: An Introduction (MIT Press, Cambridge, MA, 1998).

TABLE 1. Performance comparisons among the four controllers (17).

Controller                       PI-1      PI-2      LSPI      KLSPI
Average absolute error (km/h)    1.8571    1.3040    1.2877    1.0494
Standard deviation (km/h)        2.3876    1.8943    2.4447    2.0648

TABLE 2. Performance comparisons between two control methods based on lateral control experiments on the campus road (18). ē is the average lateral error, S(e) is the standard deviation of lateral error, ā is the average lateral acceleration, and aM is the maximum lateral acceleration.

Controller                ē (m)     S(e) (m)    ā (m/s²)    aM (m/s²)
KDHP controller           0.0660    0.0757      0.1671      0.8737
Pure pursuit controller   0.1046    0.0810      0.1660      0.9771


Compliant robotic manipulation: A neurobiologic strategy

Hong Qiao1,2,3*, Chao Ma4, and Rui Li1,3

Robotics will be key to the success of automation technologies of the future, particularly in the areas of advanced manufacturing and services. The development of compliant robots, which are characterized by flexibility, suppleness, and freedom of movement, has become an area of intensive research. Although the concept of compliance has not been strictly defined, it implies that robots are capable of dexterous manipulation and able to sense and respond to interactions with humans with exquisite precision.

Achieving compliant design

Compliant robotic manipulation can effectively reduce dependency on sensory information while improving relevant operational capabilities (1–3). Yet designing effective compliance strategies remains a challenging task. In recent years, researchers have begun to develop methods and devices for applying compliance by leveraging information within the robot’s environment. In particular, the concept of “attractive region in environment” (ARIE) has achieved encouraging results.

ARIE is a type of constrained region in the configuration space of the robotic system, formed by its environment. In such regions, the state of the system will converge to a stable point with a state-independent input applied to the system. By way of example, imagine that there is a bowl and a bean in the physical space (Figure 1). If we drop the

1The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
2Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China
3School of Computer and Control Engineering, University of Chinese Academy of Sciences, Beijing, China
4School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing, China
*Corresponding Author: [email protected]

FIGURE 2. ARIE in the configuration space of a typical round-peg, round-hole insertion task. The color of the region varies from yellow to blue as the uncertainty of the system decreases.

FIGURE 1. An intuitive illustration of ARIE using the “bowl–bean” system in the physical space.

17. C. Szepesvari, Algorithms for Reinforcement Learning (Morgan and Claypool, Williston, VT, 2010).
18. F. Lewis, D. Vrabie, IEEE Trans. Circuits Syst. 9, 32–50 (2009).
19. W. B. Powell, Approximate Dynamic Programming: Solving the Curses of Dimensionality (Wiley, New York, 2007).
20. L. Busoniu, R. Babuska, B. D. Schutter, D. Ernst, Reinforcement Learning and Dynamic Programming Using Function Approximators (CRC Press, Boca Raton, FL, 2010).
21. D. Liu, Q. Wei, IEEE Trans. Neural Netw. Learn. Syst. 25, 621–634 (2014).
22. X. Xu, L. Zuo, Z. H. Huang, Inf. Sci. (NY) 261, 1–31 (2014).
23. D. Silver et al., Nature 529, 484–489 (2016).
24. R. D. Samson, M. J. Frank, J.-M. Fellous, Cogn. Neurodyn. 4, 91–105 (2010).
25. J. T. Arsenault, K. Nelissen, B. Jarraya, W. Vanduffel, Neuron 77, 1174–1186 (2013).
26. R. Chowdhury et al., Nat. Neurosci. 16, 648–653 (2013).
27. R. E. Suri, W. Schultz, Neuroscience 91, 871–890 (1999).

28. W. Schultz, J. Neurophysiol. 80, 1–27 (1998).
29. P. N. Tobler, C. D. Fiorillo, W. Schultz, Science 307, 1642–1645 (2005).
30. R. S. Sutton, Mach. Learn. 3, 9–44 (1988).
31. B. Seymour et al., Nature 429, 664–667 (2004).
32. X. Xu, H. G. He, D. W. Hu, J. Artif. Intell. Res. 16, 259–292 (2002).
33. M. G. Lagoudakis, R. Parr, J. Mach. Learn. Res. 4, 1107–1149 (2003).
34. J. Boyan, Mach. Learn. 2–3, 233–246 (2002).

Acknowledgments
The authors would like to thank our colleagues in the unmanned vehicle laboratory of the National University of Defense Technology for preparation of the experimental UGV platform. This work was supported by the National Natural Science Foundation of China (NSFC) under grant 91420302, the Innovation and Development Joint Foundation between NSFC and the Chinese Automobile Industry under grant U1564214, and the Pre-research Project of the National University of Defense Technology.


limited sensing information, humans can plan and properly execute complex motions such as grasping. We posit that, by careful observations of how humans achieve these tasks, robots can be designed to execute a variety of complex movements in response to a range of object types.

Based on these observations and the concept of ARIE, we have proposed the adoption of a framework called “constrained region in environment” (CRIE) to achieve compliant robotic manipulation. This system would integrate constraints formed by the environment and coarse-sensing information. As illustrated in Figure 3, one of the distinguishing features of CRIE is the combination of offline planning and online dynamic compliant adjustments to compensate for uncertainties inherent in the ideal model. Thus, by using manipulation states acquired by coarse-sensing information, a compliant robot could determine whether an appropriate adjustment should be implemented to asymptotically track the predesigned planning sequence. In robotics, the implementation of various sensors and related technologies has grown in recent years, and offers boundless prospects in a wide range of manipulation tasks. The CRIE framework has great potential in engineering applications, including the challenging task of capturing information not provided directly by sensors. CRIE has been successful in providing more flexibility and efficiency for robots in executing grasping and other manipulation tasks; however, more effort in control and planning strategies for robotic manipulation is needed to achieve a higher degree of compliance.
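Structurally, the CRIE idea can be summarized as an offline plan tracked by an online adjustment loop. The skeleton below is our illustration of that structure only; the plan stages, the coarse-sensing placeholder, and the adjustment threshold are invented for the example.

# Skeleton of the offline-plan / online-adjust idea behind CRIE (illustrative only).
OFFLINE_PLAN = ["approach", "align", "insert", "release"]    # predesigned sequence (assumed)

def coarse_sense():
    """Placeholder for coarse-sensing information about the manipulation state."""
    return {"deviation": 0.02}

def execute(stage):
    print("executing", stage)

def compliant_adjust(stage, deviation):
    print(f"adjusting during '{stage}' (deviation {deviation:.3f})")

for stage in OFFLINE_PLAN:
    execute(stage)
    state = coarse_sense()
    if state["deviation"] > 0.05:            # adjust only when tracking drifts too far
        compliant_adjust(stage, state["deviation"])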

Integrating neurobiologically inspired mechanisms

Bridging the existing gaps in compliant manipulation between robots and humans is challenging. One way to solve this problem is to introduce a human-like manipulation mechanism to robots. Researchers are therefore turning to neurobiology for inspiration.

Humans are capable of executing a variety of complicated

bean into the bowl, no matter the starting position of the bean, the effect of gravity and friction will act on the bean such that it will finally come to rest at the bottom of the bowl. ARIE is just like such a bowl in the configuration space; the state is the position of the bean, the state-independent input is the gravity, which is a constant in this case, and the stable point is the bottom of the bowl. If we can find ARIE and the state-independent input in the configuration space of the robotic system, then the task can be achieved without the need for sensing information.

ARIE enables the uncertainty of the system state to be eliminated by a state-independent input (meaning that the input is not a function of the state of the system). It has been implemented in robotic grasping and localization tasks, and has become a common feature in the configuration space of robotic systems (4–7). Figure 2 shows an illustration of the performance of an ARIE system in a typical round-peg, round-hole insertion task. According to the relationship between the shape of the peg and the hole in the physical space and that of the constrained region in the configuration space, a series of compliant strategies to achieve peg insertion can be designed.
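A toy simulation makes the bowl-bean picture concrete: with a state-independent input (gravity) and damping, the horizontal position of the bean converges to the bottom of the bowl from any starting point. The bowl shape and constants below are arbitrary assumptions.

import numpy as np

def settle(x0, steps=2000, dt=0.01, damping=1.0, g=9.8):
    # bowl height h(x) = x**2, so the slope driving the bean back is dh/dx = 2x
    x, v = x0, 0.0
    for _ in range(steps):
        a = -g * (2.0 * x) - damping * v     # gravity along the surface plus friction
        v += a * dt
        x += v * dt
    return x

print([round(settle(x0), 3) for x0 in (-1.0, -0.3, 0.7, 1.5)])   # all end near 0.0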

Although environmental constraints can be used to develop compliance, some a priori knowledge of the robot and the object to be manipulated is still necessary in the design of compliant robotic applications. Researchers are now focusing on integrating human-type features into robotics to achieve compliant functionality. Here we review how applying neurobiological paradigms can improve robotic compliance.

Adopting compliant functionality

A central goal of compliance research has been to introduce the functionality of human hands to robots. Methods for compliant control and motion have been proposed, and a large number of remote center compliance (RCC) devices have been developed. An RCC device is a mechanical device that moves the center of compliance in an automated assembly such that object jamming can be prevented. However, these devices remain deficient in comparison to the capabilities of human hands. How do we therefore impart features of human control to robots, and how can we integrate these features to improve compliance? Humans do not possess the most highly evolved sensory organs; nonetheless, they are capable of many sophisticated operations. In fact, the ability to deal with a complex real-world environment in a flexible manner is one of the most striking features of human beings. Even with

FIGURE 3. An illustration of the constrained region in environment (CRIE) framework.


tasks with high dexterity through coordination between their hands, eyes, and limbs. The combination of complex anatomical structure with dense, tactile sensory organs and the complex motion control that human hands possess is essential for achieving such a range of tasks. For example, the fingers carry the most densely innervated areas of the body and the richest source of tactile feedback,

and possess the greatest positioning capability of the body. Thus, mimicking the structure, function, and control of human hands provides an excellent model for improving the compliance ability of robotic hands. Dexterous robotic hands have received increasing attention in recent years and have resulted in a number of human hand imitations, such as the Shadow Dexterous Hand, the Barrett Hand, Robotiq Grippers, and RightHand Labs’ ReFlex Hand. Compared with simple grippers or other rudimentary mechanical hands, dexterous robot hands provide more flexibility and improve the user’s ability to execute grasping and manipulation tasks. These capabilities are due to a variety of visual, force/torque, or tactile sensors, and related intelligent algorithms incorporated into the design (8–10).

Although such robotic hands can achieve simple compliant manipulations, the human hand has a more intricate and complex structure. There are 27 bones in the human hand. Together with the forearm, more than 30 individual muscles work together to achieve flexibility, precise control, and gripping strength for various tasks. The human hand is controlled by two groups of muscles: extrinsic muscles (containing long flexors and extensors) located on the forearm, which control crude movement, and intrinsic muscles located within the hand, which perform fine motor functions. The most flexible parts of the hand are the fingers, which are mostly controlled by strong muscles in the forearm (11). The mechanisms underlying hand movement are closely connected to the motor nervous system. Understanding this connection may help in developing robotic systems with enhanced compliance abilities.

In recent years, researchers have gained inspiration from studying the makeup of the nervous system, and have begun to apply their ideas to robotic compliance design. One such idea is that a tendon-muscle–based structure could provide a compliant strategy to control movement of the robot hand and arm. In this scenario, neural signals

to control the hand would interface with a design whereby the hand could turn dynamically, providing compliance in a variety of situations. The mechanical structure of the robotic hand can be designed in a compliant manner based on the anatomical structure of the human hand. It would also be necessary to provide a muscle-like driving structure that would function as the hardware of the motor nervous system. Figure 4 illustrates the idea of a tendon-muscle–based structure with a compliant strategy enabling a robot hand to insert a key into a lock, a task that requires a high degree of compliance.

Conclusions and perspectives

Developing robots with human-like tactile abilities has been a long-standing goal of robotics. The field of compliance for robotic manipulation has developed many remarkable strategies, ranging from combining compliant functionality with environmental constraints and coarse-sensing information to integrating neurobiologically inspired mechanisms into robotic hand design, all paving the way for many promising robotic applications. Exploring neurobiologically inspired paradigms for compliance can also provide a better understanding of how the human body carries out manipulation tasks. We plan to design and test a neurobiologically inspired robotic hand with a muscle-like structure; this system will integrate a neurobiologically inspired model of the motor nervous system with compliance-based technology in the robotic hand, which could lead to a low-cost and fast robotic manipulation system with an inherent learning ability.

References
 1. T. Lozano-Perez, M. T. Mason, R. H. Taylor, Int. J. Robot. Res. 3, 3–24 (1984).
 2. M. R. Ahmed, Compliance Control of Robot Manipulator for Safe Physical Human Robot Interaction, Ph.D. thesis, Örebro University (2011).
 3. E. Mattar, Rob. Auton. Syst. 61, 517–544 (2013).
 4. H. Qiao, Int. J. Prod. Res. 40, 975–1002 (2002).
 5. J. Su, H. Qiao, C. Liu, Z. Ou, Int. J. Adv. Manuf. Tech. 59, 1211–1225 (2012).
 6. H. Qiao, M. Wang, J. Su, S. Jia, R. Li, IEEE/ASME Trans. Mechatron. 20, 2311–2327 (2015).
 7. J. Su, H. Qiao, Z. Ou, Y. Zhang, Assembly Autom. 32, 86–99 (2012).
 8. T. Iberall, Int. J. Robot. Res. 16, 285–299 (1997).
 9. A. Bicchi, R. Sorrentino, Proc. IEEE International Conference on Robotics and Automation 1, 452–457 (1995).
10. A. Bicchi, IEEE Trans. Robot. Autom. 16, 652–662 (2000).
11. R. Tubiana, J. M. Thomine, E. Mackin, Examination of the Hand and Wrist, 2nd Ed. (Martin Dunitz, London, 1998).

Acknowledgments
This research was partially supported by the National Natural Science Foundation of China (61210009), the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB02080003), the Beijing Municipal Science and Technology Commission (D16110400140000 and D161100001416001), and the Fundamental Research Funds for the Central Universities (FRF-TP-15-115A1).

FIGURE 4. An illustration of a task requiring a high level of compliance: inserting a key into a lock.


Declarative and procedural knowledge modeling methodology for brain cognitive function analysis

Yun Su1,2, Xiaowei Zhang1, Philip Moore1, Jing Chen1, Xu Ma1, and Bin Hu1*

Driven by the rapid ongoing advances in neuroscience, cognitive science, and computer science, brain-inspired intelligence R&D is blossoming (1). Brain-inspired intelligence is a field that addresses the creation of machine intelligence through computational modeling and software–hardware coordination, drawing inspiration from the information-processing mechanisms of the human brain at various levels of granularity (2). Research in this field has led to the development of intelligence control architectures in humanoid robots. The area of humanoid robotics has gained traction as a concrete example of the accomplishments of brain-inspired intelligence research and as a research tool for neural simulation.

In this review, we discuss how ontology technologies and data mining algorithms are being used to model knowledge of multimodal psychophysiological data including electroencephalogram (EEG) and functional magnetic resonance imaging (fMRI), with the aim of predicting the affective state and stage of mild cognitive impairment (MCI). Such methods hold the potential to model memory and enable effective management of inputs (sensory information), and to provide an effective basis for the conversion of the information to an invariant form that can be processed to infer and predict new knowledge.

Modeling memory

At the crux of intelligence is the ability to implement memory and predictions (3). The brain uses a vast amount of memory to create a model of the world and makes continuous predictions of future events. When we see, feel, hear, or experience external sensory stimulation, the cortex takes the detailed, highly specific input and converts it to an invariant form, which captures the essence of relationships within the observed world, independent of the details of the moment. When there are consistent patterns among the sensory inputs flowing into the brain, the cortex will process them to retrieve the invariant form from memories and combine patterns of the invariant structure with the most recent details to predict future events.

When considering humanoid robotics, one of the key challenges is how to model human memory, whether in designing hardware to deal with large-scale brain simulations or software with brain-inspired cognitive architectures. Declarative memory (i.e., the knowledge of “what”) and procedural memory (i.e., the knowledge of “how”) are the two important kinds of memory that represent declarative knowledge and procedural knowledge, respectively. They have been implemented as important components in many humanoid robotic systems and underlying brain-inspired cognitive architectures, such as Edelman’s neural automata (4), ACT–R (adaptive control of thought–rational) (5), and 4CAPS (cortical capacity-constrained concurrent activation-based production system) (6).

Memory has a complicated and hierarchical structure that stores information about important relationships between entities in the world. Similarly, ontological technology can provide linguistic, semantic, and quantitative meaning by showing relationships between concepts. This technology provides a suitable and effective model in a form readable by both computerized systems and humans. Moreover, it enables the presentation of the hierarchical declarative meaning of concepts along with modeling of procedural knowledge, and can use machine learning technologies and statistical methods to infer and predict new knowledge. Thus, ontology-based modeling can be applied to declarative and procedural knowledge modeling in the memory systems of humanoid robots. By combining ontological techniques with data mining algorithms, we have conducted studies of declarative knowledge modeling of psychophysiological signals and procedural knowledge modeling of both emotion assessment and early-stage prediction of MCI (7–11).

EEG-based emotion assessment

Emotion is considered a critical aspect of intelligent human behavior. Current research shows increasing interest in using emotion assessment to apply brain-inspired intelligence more successfully toward improving human-machine interactions. Based on the analysis of EEG signals, which are derived from autonomic nervous system responses, computers can assess user emotions and find correlations between significant EEG features and the human emotional state. With the advent of modern signal processing techniques, the evaluative power of EEG-based analysis of human emotion has increased, due in part to the large number of features typically extracted from EEG signals. However, manually managing this expanded feature set in a structured way is too complex a task for researchers.
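As a hedged illustration of the kind of feature set meant here, the sketch below computes band-power features for a single EEG channel with Welch's method. The simulated channel data, sampling rate, and frequency-band definitions are placeholders rather than the authors' actual preprocessing pipeline.

import numpy as np
from scipy.signal import welch

def band_power(signal, fs, low, high):
    # Average power of one EEG channel within a frequency band (Hz),
    # estimated from the Welch power spectral density.
    freqs, psd = welch(signal, fs=fs, nperseg=2 * fs)
    mask = (freqs >= low) & (freqs < high)
    return np.trapz(psd[mask], freqs[mask])

fs = 128                                   # hypothetical sampling rate (Hz)
eeg = np.random.randn(60 * fs)             # 60 s of simulated single-channel EEG
bands = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 45)}
features = {name: band_power(eeg, fs, lo, hi) for name, (lo, hi) in bands.items()}
print(features)                            # one scalar feature per band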

Management and distribution of the data present significant challenges, which can only be addressed through advanced tools for data representation, mining, and integration. We have therefore carried out studies to develop tools for implementing declarative knowledge and procedural knowledge models. To build declarative knowledge models, we extract knowledge from EEG data and contextual information, then encode the knowledge and information to create the domain ontology. We address procedural knowledge using a rule set trained by data mining algorithms, which can identify the emotional state of the subject from the declarative knowledge.

FIGURE 1. EEG-based emotion assessment system using ontology and data mining techniques. (A) Emotiono, an ontology developed for the EEG context (8). (B) Framework of EmotionO+, an ontology model used for early-stage prediction and intervention in depression (9). (C) The mapping process from raw EEG data documents to the Rawdata ontology with EEG data (10). EEG, electroencephalogram; fNIRS, functional near-infrared spectroscopy.

Zhang et al. have developed an ontology called "Emotiono," which models declarative knowledge that includes contextual information and EEG-related knowledge, as well as the relationships between them. Figure 1A shows a representation of some key entities defined in this ontology. The key top-level elements consist of classes and properties that describe the #Emotion, #User, and #Situation concepts [Web Ontology Language (OWL) classes] together with the <hasEEGFeature> and <hasEmotion> properties. To obtain the emotional state, the C4.5 algorithm was used to mine the main relations between the EEG features of a given person and their affective state, with every path from the root node to a leaf node in the constructed decision tree treated as a piece of procedural knowledge. Knowledge reasoning, supported by an inference engine that links the declarative knowledge with the associated procedural knowledge, was then used to identify the current emotional state from the captured data. Experimental results based on the eNTERFACE'06 (www.enterface.net/results) and DEAP (www.eecs.qmul.ac.uk/mmv/datasets/deap) databases showed that this approach recognizes human emotion with high accuracy (7, 8).
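The rule mining step can be sketched as follows. C4.5 itself is not available in scikit-learn, so the sketch uses scikit-learn's CART decision tree with an entropy criterion as a stand-in, and the EEG feature names, labels, and training data are invented for illustration; each printed root-to-leaf path corresponds to one procedural-knowledge rule.

import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical training set: rows are trials, columns are EEG features
# (e.g., band power per electrode); labels are discrete emotional states.
rng = np.random.default_rng(0)
X = rng.random((200, 4))
y = rng.integers(0, 3, size=200)             # 0 = calm, 1 = happy, 2 = sad (placeholders)
feature_names = ["alpha_F3", "alpha_F4", "beta_F3", "beta_F4"]

# CART with an entropy criterion stands in here for the C4.5 algorithm.
tree = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
tree.fit(X, y)

# Each root-to-leaf path in the printed tree reads as one IF-THEN rule
# mapping EEG features to an emotional state.
print(export_text(tree, feature_names=feature_names))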

Based on the Emotiono ontology, Su et al. proposed an extended ontology model called "EmotionO+" that semantically represents EEG and functional near-infrared spectroscopy (fNIRS) data together with contextual information. In addition, their ontology model incorporates emotion-reasoning rules, mined from EEG data using the random forest algorithm. This model can be used in systems designed to enable early-stage prediction and intervention in depressive disorders (Figure 1B) (9).

Building on the work of Zhang et al., Chen et al. proposed a mapping process that converts unstructured raw EEG documents into an ontology called "Rawdata" with structured EEG data, using document content identification, query, and information description mechanisms (Figure 1C) (10). Several OWL classes are first defined in the Rawdata ontology, but they initially have no individuals (i.e., members of the classes) or explicit properties (i.e., relationships between the OWL classes) [Figure 1C(a)]. Querying and searching are then applied to both the Rawdata ontology and the raw data documents to match keywords, and individuals are built under the corresponding classes [Figure 1C(b) and (c)]. Following these steps, the raw EEG data and the data-related content are represented in the Rawdata ontology, which now includes explicit individuals, properties, and the relationships between individuals [Figure 1C(d)]. Finally, the structured description of the raw EEG data can be further processed using quantitative EEG analysis and ontology modeling.
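A hedged sketch of this kind of document-to-ontology mapping is shown below using the rdflib library; the class names, keyword list, and the toy "raw document" are placeholders for illustration and do not reproduce Chen et al.'s actual mechanisms.

from rdflib import Graph, Literal, Namespace
from rdflib.namespace import OWL, RDF

RAW = Namespace("http://example.org/rawdata#")
g = Graph()
g.bind("raw", RAW)

# Step (a): OWL classes defined up front, with no individuals yet.
for cls in ("Subject", "Channel", "Recording"):
    g.add((RAW[cls], RDF.type, OWL.Class))

# A toy "unstructured" raw EEG document: free-text key/value lines.
raw_document = [
    "subject: s01",
    "channel: Fp1",
    "sampling_rate: 256",
]

# Steps (b)-(c): keyword matching between document content and ontology classes,
# building individuals under the corresponding classes.
keyword_to_class = {"subject": RAW.Subject, "channel": RAW.Channel}
for line in raw_document:
    key, _, value = line.partition(":")
    key, value = key.strip(), value.strip()
    if key in keyword_to_class:
        individual = RAW[value]
        g.add((individual, RDF.type, keyword_to_class[key]))
        g.add((individual, RAW.hasLabel, Literal(value)))

# Step (d): the graph now holds explicit individuals and properties and can be
# serialized for downstream quantitative EEG analysis.
print(g.serialize(format="turtle"))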

FIGURE 2. MRI-based decision support system for the diagnosis of MCI using ontology and data mining techniques (11). (A) The structure of the mild cognitive impairment (MCI) domain ontology. (B) The framework of decision support for MCI diagnosis. ROI, region of interest; NMR, nuclear magnetic resonance; CDD, cognitive deterioration diagnosis; SVM, support vector machine; MRI, magnetic resonance imaging.

MRI-based decision support for diagnosis of MCI

In the medical domain, ontology-based modeling has been used as a data structure, as it enables the definition of semantic descriptors, data integration, and management of domain knowledge in clinical health care. Given these attributes, we have studied ontology-driven decision support for the diagnosis of MCI based on MRI.

Brain structure changes associated with MCI have been widely researched. In structural MRI studies, for example, the cortical thickness of patients suffering from MCI has been shown to be significantly reduced due to gray matter atrophy (12). To improve diagnosis, we propose an objective method capable of diagnosing MCI automatically using the cortical thickness of each region of interest in the MRI scan. Managing MRI/MCI knowledge hierarchically through ontology-based modeling may provide significant diagnostic benefits.

Based on this hypothesis, Zhang et al. encoded specialized declarative knowledge, including MRI/MCI and environment-context knowledge, into an ontology (Figure 2A), and mined a rule set representing the procedural knowledge using the C4.5 machine learning algorithm (11). The two parts were then applied in conjunction with a reasoning engine to automatically distinguish MCI patients from normal controls (Figure 2B). The rule set was trained on MRI data from 187 MCI patients and 177 normal controls selected from the Alzheimer's Disease Neuroimaging Initiative database (adni.loni.usc.edu). Evaluation of the results supported the conclusion that the approach can assist physicians and health care professionals in the clinical diagnosis of MCI under real-world conditions.
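How the two parts fit together can be illustrated with a minimal, assumption-laden sketch: the declarative knowledge is reduced to a dictionary of per-subject facts (here, mean cortical thickness for two regions of interest), and the procedural knowledge to a toy rule set of the kind a C4.5-style learner might output. The region names and thresholds are invented and are not values from reference (11).

import operator

# Declarative knowledge: hypothetical facts for one subject; a real ontology
# would hold many more regions of interest and contextual attributes.
subject_facts = {
    "thickness_entorhinal": 2.9,        # mean cortical thickness (mm), made up
    "thickness_parahippocampal": 2.4,   # made up
    "age": 74,
}

# Procedural knowledge: a toy rule set; each rule is (conditions, conclusion).
rules = [
    ({"thickness_entorhinal": ("<", 3.0), "thickness_parahippocampal": ("<", 2.5)}, "MCI"),
    ({"thickness_entorhinal": (">=", 3.0)}, "normal control"),
]

def infer(facts, rules, default="undetermined"):
    # Minimal reasoning step: fire the first rule whose conditions all hold.
    ops = {"<": operator.lt, ">=": operator.ge}
    for conditions, conclusion in rules:
        if all(ops[op](facts[name], threshold)
               for name, (op, threshold) in conditions.items()):
            return conclusion
    return default

print(infer(subject_facts, rules))  # -> "MCI" for this toy subject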

Discussion and conclusions

In summary, our studies have shown that the proposed models can be used to characterize declarative knowledge and procedural knowledge. In the proposed approach, emotion-context, MCI-context, and physiological data can be modeled as declarative knowledge based on ontology components that include classes, individuals, and properties. The relationships between emotion, MCI, and the physiological data can then be constructed as procedural knowledge using data mining techniques. The emotional state or the stage of the disease can subsequently be predicted from the captured data, based on knowledge reasoning supported by an inference engine that combines the declarative knowledge with the associated procedural knowledge.

Our proposed model enables the management of structured or unstructured data, converts the data into an invariant form (i.e., knowledge), and predicts future events (i.e., new knowledge). Such methods have great potential for use in brain-inspired intelligence technologies that demand the modeling of the declarative content and procedural content recorded in memory.

References

1. H. de Garis, C. Shuo, B. Goertzel, L. Ruiting, Neurocomputing 74, 3–29 (2010).
2. Y. Zeng, C.-L. Liu, T.-N. Tan, Chin. J. Comput. 39, 212–222 (2016).
3. J. Hawkins, S. Blakeslee, On Intelligence (Macmillan, London, 2007), pp. 60–71.
4. G. M. Edelman, Science 318, 1103–1105 (2007).
5. J. R. Anderson, C. Lebiere, Behav. Brain Sci. 26, 587–601 (2003).
6. M. A. Just, S. Varma, Cogn. Affect. Behav. Neurosci. 7, 153–191 (2007).
7. X. Zhang, B. Hu, P. Moore, J. Chen, L. Zhou, in 18th International Conference on Neural Information Processing (ICONIP 2011), Part II (Springer, Berlin/Heidelberg, Germany, 2011), pp. 89–98.
8. X. Zhang, B. Hu, J. Chen, P. Moore, World Wide Web 16, 479–513 (2013).
9. Y. Su, B. Hu, L. Xu, H. Cai, P. Moore, in Proceedings of the 2014 IEEE International Conference on Bioinformatics and Biomedicine (IEEE, Belfast, United Kingdom, November 2014), pp. 529–535.
10. J. Chen, B. Hu, P. Moore, X. Zhang, X. Ma, Appl. Soft Comput. 30, 663–674 (2015).
11. X. Zhang, B. Hu, X. Ma, P. Moore, J. Chen, Comput. Methods Programs Biomed. 113, 781–791 (2014).
12. Z. Yao, B. Hu, C. Liang, L. Zhao, M. Jackson, PLOS ONE 7, e48973 (2012).

Acknowledgments

This work was supported by the National Basic Research Program of China (973 Program) (2014CB744600), the National Natural Science Foundation of China (61210010 and 61402211), the Natural Science Foundation of the Education Department of Gansu Province, China (2015A-008), and the Northwest Normal University Foundation (NWNU-LKQN-14-5).