
A review of learning planning action models

Published online by Cambridge University Press: 21 November 2018

Ankuj Arora, Humbert Fiorino, Damien Pellier, Marc Métivier and Sylvie Pesty

Affiliations:
Université Grenoble Alpes, CNRS, Grenoble INP, LIG, 38000 Grenoble, France (A. Arora, H. Fiorino, D. Pellier, S. Pesty)
e-mail: Ankuj.Arora@univ-grenoble-alpes.fr, Humbert.Fiorino@univ-grenoble-alpes.fr, Damien.Pellier@univ-grenoble-alpes.fr, Sylvie.Pesty@univ-grenoble-alpes.fr
Laboratoire d'informatique de Paris Descartes, Université Paris-Descartes, 45 rue des Saints-Pères, 75006 Paris, France (M. Métivier)
e-mail: marc.metivier@parisdescartes.fr

Abstract

Automated planning has been a continuous field of study since the 1960s, as the notion of accomplishing a task through an ordered set of actions resonates with almost every known activity domain. However, as we move from toy domains toward the complexity of the real world, these actions become increasingly difficult to codify. The reasons range from the sheer labor of hand-coding, to intricacies so subtle that they surface only late in the development process. In such domains, planners now leverage recent advances in machine learning to learn action models, that is, blueprints of all the actions whose execution produces transitions in the system. This learning allows the model to evolve toward a version that is more consistent with, and better adapted to, its environment, increasing the probability that its plans succeed. It is also a deliberate effort to reduce laborious manual coding and improve quality. This paper presents a survey of the machine learning techniques applied to learning planning action models. It first describes the characteristics of learning systems, then details the learning techniques used in the literature over the past decades, and finally discusses some open issues.
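To make the surveyed notion concrete, the following is a minimal sketch in Python of what an action model looks like under a classical STRIPS-style assumption: a blueprint with parameters, preconditions, and add/delete effects, applied to a state represented as a set of ground facts. All identifiers here (ActionModel, the move operator, the at predicate) are illustrative, not notation from the paper.

from dataclasses import dataclass

@dataclass(frozen=True)
class ActionModel:
    name: str
    parameters: tuple        # variable names, e.g. ("?r", "?from", "?to")
    preconditions: frozenset # facts that must hold for the action to apply
    add_effects: frozenset   # facts the action makes true
    del_effects: frozenset   # facts the action makes false

    def _ground(self, facts, binding):
        # Substitute concrete objects for parameter variables in each fact.
        return frozenset(tuple(binding.get(t, t) for t in f) for f in facts)

    def apply(self, state, binding):
        # Return the successor state, or None if a precondition is unmet.
        if not self._ground(self.preconditions, binding) <= state:
            return None
        return (state - self._ground(self.del_effects, binding)) \
            | self._ground(self.add_effects, binding)

# A hand-coded 'move' blueprint -- the kind of model that the learning
# techniques surveyed in this paper aim to induce from observed plan traces.
move = ActionModel(
    name="move",
    parameters=("?r", "?from", "?to"),
    preconditions=frozenset({("at", "?r", "?from")}),
    add_effects=frozenset({("at", "?r", "?to")}),
    del_effects=frozenset({("at", "?r", "?from")}),
)

state = {("at", "robot", "roomA")}
successor = move.apply(state, {"?r": "robot", "?from": "roomA", "?to": "roomB"})
assert successor == {("at", "robot", "roomB")}

Action-model learning systems of the kind reviewed in this survey attempt to recover exactly these precondition and effect sets automatically from execution traces, rather than relying on a domain expert to hand-code them.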

Type: Principles and Practice of Multi-Agent Systems
Copyright: © Cambridge University Press, 2018
