
3 - Challenges and issues faced in building a framework for conducting research in learning from observation

Published online by Cambridge University Press:  10 December 2009

Darrin Bentivegna
Affiliation:
Kyoto, Japan and Computational Brain Project, ICORP, Japan Science and Technology Agency, Kyoto, Japan
Christopher Atkeson
Affiliation:
Kyoto, Japan and Carnegie Mellon University, Robotics Institute, Pittsburgh, USA
Gordon Cheng
Affiliation:
Kyoto, Japan and Computational Brain Project, ICORP, Japan Science and Technology Agency, Kyoto, Japan
Chrystopher L. Nehaniv
Affiliation:
University of Hertfordshire
Kerstin Dautenhahn
Affiliation:
University of Hertfordshire

Summary

Introduction

We are exploring how primitives, small units of behavior, can speed up robot learning and enable robots to learn difficult dynamic tasks in reasonable amounts of time. In this chapter we describe work on learning from observation and learning from practice on air hockey and marble maze tasks. We discuss our research strategy, results, and open issues and challenges.

Primitives are units of behavior above the level of motor or muscle commands. There have been many proposals for such units of behavior in neuroscience, psychology, robotics, artificial intelligence, and machine learning (Arkin, 1998; Schmidt, 1988; Schmidt, 1975; Russell and Norvig, 1995; Barto and Mahadevan, 2003). There is a great deal of evidence that biological systems have units of behavior above the level of activating individual motor neurons, and that the organization of the brain reflects those units of behavior (Loeb, 1989). In human eye movement, for example, there are only a few movement types: saccades, smooth pursuit, the vestibulo-ocular reflex (VOR), optokinetic nystagmus (OKN), and vergence. General eye movements are generated as sequences of these behavioral units, and distinct brain regions are dedicated to generating and controlling each type (Carpenter, 1988). Similarly, legged animals move using a small set of discrete locomotion patterns, or gaits (McMahon, 1984). Whether there are corresponding units of behavior for upper-limb movement in humans and other primates is not yet clear.
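As a rough illustration of how primitives can combine learning from observation with learning from practice, the sketch below selects the primitive whose observed context is nearest to the current state, then adjusts its parameters from a scalar reward. The demonstration data, primitive names, and update rule are invented for illustration and are not the chapter's actual implementation.

```python
import math

# Hypothetical demonstration data for the air hockey task: each record pairs
# an observed state (a 2-D puck position, made up for illustration) with the
# primitive the demonstrator chose and its parameters. The primitive names
# and the nearest-neighbour selection rule are assumptions, not the authors'
# exact framework.
DEMONSTRATIONS = [
    {"state": (0.1, 0.9), "primitive": "straight_shot", "params": {"speed": 1.2}},
    {"state": (0.8, 0.7), "primitive": "bank_shot", "params": {"speed": 0.9}},
    {"state": (0.5, 0.2), "primitive": "defend", "params": {"speed": 0.4}},
]

def select_primitive(state, demos=DEMONSTRATIONS):
    """Learning from observation: choose the primitive whose observed
    context is nearest to the current state (1-nearest-neighbour)."""
    nearest = min(demos, key=lambda d: math.dist(d["state"], state))
    return nearest["primitive"], dict(nearest["params"])

def learn_from_practice(params, reward, learning_rate=0.1):
    """Learning from practice: nudge the primitive's parameters in
    proportion to a scalar reward signal (a crude update rule)."""
    return {k: v + learning_rate * reward for k, v in params.items()}

# A puck near (0.15, 0.85) is closest to the first demonstration,
# so the straight_shot primitive is selected and then refined.
primitive, params = select_primitive((0.15, 0.85))
params = learn_from_practice(params, reward=0.5)
```

In the chapter's actual systems, selection is done with more sophisticated memory-based methods such as locally weighted learning (Atkeson et al., 1997); the nearest-neighbour rule here stands in for that machinery only to show the division of labour between observation and practice.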

Type
Chapter
Information
Imitation and Social Learning in Robots, Humans and Animals: Behavioural, Social and Communicative Dimensions, pp. 47–66
Publisher: Cambridge University Press
Print publication year: 2007


References

Aboaf, E., Drucker, S. and Atkeson, C. (1989). Task-level robot learning: juggling a tennis ball more accurately. In IEEE International Conference on Robotics and Automation, Scottsdale, AZ, 1290–5.
Arkin, R. C. (1998). Behavior-Based Robotics. Cambridge, MA: MIT Press.
Atkeson, C. G., Moore, A. W. and Schaal, S. (1997). Locally weighted learning. Artificial Intelligence Review, 11, 11–73.
Balch, T. (1997). Clay: integrating motor schemas and reinforcement learning. Technical Report GIT-CC-97-11, College of Computing, Georgia Institute of Technology, Atlanta, GA.
Barto, A. and Mahadevan, S. (2003). Recent advances in hierarchical reinforcement learning. Discrete Event Dynamic Systems, 13, 41–77.
Bentivegna, D. C. (2004). Learning from Observation Using Primitives. PhD thesis, Georgia Institute of Technology, Atlanta, GA. http://etd.gatech.edu/theses/available/etd-06202004-213721/.
Bentivegna, D. C., Atkeson, C. G. and Cheng, G. (2003). Learning from observation and practice at the action generation level. In IEEE-RAS International Conference on Humanoid Robots (Humanoids 2003), Karlsruhe, Germany.
Bentivegna, D. C., Ude, A., Atkeson, C. G. and Cheng, G. (2002). Humanoid robot learning and game playing using PC-based vision. In Proceedings of the 2002 IEEE/RSJ International Conference on Intelligent Robots and Systems, Switzerland, Vol. 3, 2449–54.
Brooks, R. A. (1986). A robust layered control system for a mobile robot. IEEE Journal of Robotics and Automation, RA-2(1), 14–23.
Carpenter, R. (1988). Movements of the Eyes, 2nd edn. London: Pion Press.
Delson, N. and West, H. (1996). Robot programming by human demonstration: adaptation and inconsistency in constrained motion. In IEEE International Conference on Robotics and Automation, Vol. 1, 30–6.
Dietterich, T. G. (1998). The MAXQ method for hierarchical reinforcement learning. In Proceedings of the 15th International Conference on Machine Learning. San Francisco, CA: Morgan Kaufmann, 118–26.
Erdmann, M. A. and Mason, M. T. (1988). An exploration of sensorless manipulation. IEEE Journal of Robotics and Automation, 4, 369–79.
Faloutsos, P., van de Panne, M. and Terzopoulos, D. (2001). Composable controllers for physics-based character animation. In Proceedings of SIGGRAPH 2001, Los Angeles, CA, 251–60.
Fod, A., Mataric, M. and Jenkins, O. (2000). Automated derivation of primitives for movement classification. In First IEEE-RAS International Conference on Humanoid Robots (Humanoids 2000), MIT, Cambridge, MA.
Hovland, G., Sikka, P. and McCarragher, B. (1996). Skill acquisition from human demonstration using a hidden Markov model. In Proceedings of the IEEE International Conference on Robotics and Automation, Minneapolis, MN, 2706–11.
Kaiser, M. and Dillmann, R. (1996). Building elementary skills from human demonstration. In Proceedings of the IEEE International Conference on Robotics and Automation, 2700–5.
Kandel, E. R., Schwartz, J. H. and Jessell, T. M. (1984). Principles of Neural Science, 4th edn. Norwalk, CT: McGraw-Hill/Appleton and Lange.
Kang, S. B. and Ikeuchi, K. (1993). Toward automatic robot instruction from perception: recognizing a grasp from observation. IEEE Transactions on Robotics and Automation, 9(4), 432–43.
Kuniyoshi, Y., Inaba, M. and Inoue, H. (1994). Learning by watching: extracting reusable task knowledge from visual observation of human performance. IEEE Transactions on Robotics and Automation, 10(6), 799–822.
Likhachev, M. and Arkin, R. C. (2001). Spatio-temporal case-based reasoning for behavioral selection. In Proceedings of the 2001 IEEE International Conference on Robotics and Automation, Seoul, Korea, 1627–34.
Lin, L. J. (1993). Hierarchical learning of robot skills by reinforcement. In Proceedings of the 1993 International Joint Conference on Neural Networks, 181–6.
Loeb, G. E. (1989). The functional organization of muscles, motor units, and tasks. In The Segmental Motor System. New York: Oxford University Press, 22–35.
Mataric, M. J., Williamson, M., Demiris, J. and Mohan, A. (1998). Behavior-based primitives for articulated control. In 5th International Conference on Simulation of Adaptive Behavior (SAB-98). Cambridge, MA: MIT Press, 165–70.
McGovern, A. and Barto, A. G. (2001). Automatic discovery of sub-goals in reinforcement learning using diverse density. In Proceedings of the 18th International Conference on Machine Learning. San Francisco, CA: Morgan Kaufmann, 361–8.
McMahon, T. A. (1984). Muscles, Reflexes, and Locomotion. Princeton, NJ: Princeton University Press.
Mori, T., Tsujioka, K. and Sato, T. (2001). Human-like action recognition system on whole body motion-captured file. In Proceedings of the 2001 IEEE/RSJ International Conference on Intelligent Robots and Systems, Maui, Hawaii, Vol. 2, 1214–20.
Morimoto, J. and Doya, K. (1998). Hierarchical reinforcement learning of low-dimensional sub-goals and high-dimensional trajectories. In Proceedings of the 5th International Conference on Neural Information Processing, Vol. 2, 850–3.
Pearce, M., Arkin, R. C. and Ram, A. (1992). The learning of reactive control parameters through genetic algorithms. In Proceedings of the 1992 IEEE/RSJ International Conference on Intelligent Robots and Systems, Raleigh, NC, 130–7.
Russell, S. J. and Norvig, P. (1995). Artificial Intelligence: A Modern Approach. Englewood Cliffs, NJ: Prentice Hall.
Ryan, M. and Reid, M. (2000). Learning to fly: an application of hierarchical reinforcement learning. In Proceedings of the 17th International Conference on Machine Learning. San Francisco, CA: Morgan Kaufmann, 807–14.
Schaal, S. (1997). Learning from demonstration. In Mozer, M. C., Jordan, M. I. and Petsche, T. (eds.), Advances in Neural Information Processing Systems, Vol. 9. Cambridge, MA: MIT Press, 1040.
Schmidt, R. A. (1975). A schema theory of discrete motor skill learning. Psychological Review, 82, 225–60.
Schmidt, R. A. (1988). Motor Learning and Control. Champaign, IL: Human Kinetics Publishers.
Spong, M. W. (1999). Robotic air hockey. http://cyclops.csl.uiuc.edu.
Tung, C. and Kak, A. (1995). Automatic learning of assembly tasks using a dataglove system. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Vol. 1.
Watkins, C. and Dayan, P. (1992). Q-learning. Machine Learning, 8, 279–92.
Wooten, W. L. and Hodgins, J. K. (2000). Simulating leaping, tumbling, landing and balancing humans. In IEEE International Conference on Robotics and Automation, Vol. 1, 656–62.
