Book contents
- Frontmatter
- Contents
- Contributors
- Preface
- Part I Introduction
- Part II Neuromorphic robots: biologically and neurally inspired designs
- Part III Brain-based robots: architectures and approaches
- 5 The RatSLAM project: robot spatial navigation
- 6 Evolution of rewards and learning mechanisms in Cyber Rodents
- 7 A neuromorphically inspired architecture for cognitive robots
- 8 Autonomous visuomotor development for neuromorphic robots
- 9 Brain-inspired robots for autistic training and care
- Part IV Philosophical and theoretical considerations
- Part V Ethical considerations
- Index
- References
6 - Evolution of rewards and learning mechanisms in Cyber Rodents
from Part III - Brain-based robots: architectures and approaches
Published online by Cambridge University Press: 05 February 2012
Summary
Finding design principles for reward functions is a major challenge in both artificial intelligence and neuroscience. Successful acquisition of a task usually requires rewards to be given not only for goals but also for intermediate states, to promote effective exploration. We propose a method for designing “intrinsic” rewards for autonomous robots by combining constrained policy gradient reinforcement learning with embodied evolution. To validate the method, we use the Cyber Rodent robots, for which collision avoidance, recharging from battery packs, and “mating” by software reproduction are the three major “extrinsic” rewards. We show in hardware experiments that the robots can find appropriate intrinsic rewards for the visual properties of battery packs and potential mating partners, promoting approach behaviors.
Introduction
In applications of reinforcement learning (Sutton and Barto, 1998) to real-world problems, the design of the reward function is critical for successful achievement of the task. Designing appropriate reward functions is a nontrivial, time-consuming process in practice. Although it appears straightforward to assign positive rewards to desired goal states and negative rewards to states to be avoided, finding a good balance between multiple rewards often requires careful tuning. Furthermore, if rewards are given only at isolated goal states, blind exploration of the state space takes a long time. Rewards at intermediate subgoals, or even along the trajectories leading to the goal, promote focused exploration, but appropriate design of such additional rewards usually requires prior knowledge of the task or trial and error by the experimenter.
Type: Chapter
Book: Neuromorphic and Brain-Based Robots, pp. 109-128
Publisher: Cambridge University Press
Print publication year: 2011