
A review of learning planning action models

Published online by Cambridge University Press: 21 November 2018

Ankuj Arora, Humbert Fiorino, Damien Pellier, Marc Métivier and Sylvie Pesty

Affiliations:
Université Grenoble Alpes, CNRS, Grenoble INP, LIG, 38000 Grenoble, France (A. Arora, H. Fiorino, D. Pellier, S. Pesty)
e-mail: Ankuj.Arora@univ-grenoble-alpes.fr, Humbert.Fiorino@univ-grenoble-alpes.fr, Damien.Pellier@univ-grenoble-alpes.fr, Sylvie.Pesty@univ-grenoble-alpes.fr
Laboratoire d'informatique de Paris Descartes, Université Paris-Descartes, 45 rue des Saints-Pères, 75006 Paris, France (M. Métivier)
e-mail: marc.metivier@parisdescartes.fr

Abstract

Automated planning has been a continuous field of study since the 1960s, as the notion of accomplishing a task through an ordered set of actions resonates with almost every known activity domain. However, as we move from toy domains toward the complexity of the real world, these actions become increasingly difficult to codify. The reasons range from the sheer labor of hand-coding, to intricacies so subtle that they surface only late in the development process. In such domains, planners now leverage recent advances in machine learning to learn action models, that is, blueprints of all the actions whose execution produces transitions in the system. This learning allows the model to evolve toward a version that is more consistent with, and better adapted to, its environment, increasing the probability that its plans succeed. It is also a deliberate effort to reduce laborious manual coding and improve quality. This paper presents a survey of the machine learning techniques applied to learning planning action models. It first describes the characteristics of learning systems, then details the learning techniques used in the literature over the past decades, and finally discusses some open issues.
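To make the surveyed notion concrete, the following is a minimal sketch in Python of what an action model looks like under a classical STRIPS-style assumption: a blueprint with parameters, preconditions, and add/delete effects, applied to a state represented as a set of ground facts. All identifiers here (ActionModel, the move operator, the at predicate) are illustrative, not notation from the paper.

from dataclasses import dataclass

@dataclass(frozen=True)
class ActionModel:
    name: str
    parameters: tuple        # variable names, e.g. ("?r", "?from", "?to")
    preconditions: frozenset # facts that must hold for the action to apply
    add_effects: frozenset   # facts the action makes true
    del_effects: frozenset   # facts the action makes false

    def _ground(self, facts, binding):
        # Substitute concrete objects for parameter variables in each fact.
        return frozenset(tuple(binding.get(t, t) for t in f) for f in facts)

    def apply(self, state, binding):
        # Return the successor state, or None if a precondition is unmet.
        if not self._ground(self.preconditions, binding) <= state:
            return None
        return (state - self._ground(self.del_effects, binding)) \
            | self._ground(self.add_effects, binding)

# A hand-coded 'move' blueprint -- the kind of model that the learning
# techniques surveyed in this paper aim to induce from observed plan traces.
move = ActionModel(
    name="move",
    parameters=("?r", "?from", "?to"),
    preconditions=frozenset({("at", "?r", "?from")}),
    add_effects=frozenset({("at", "?r", "?to")}),
    del_effects=frozenset({("at", "?r", "?from")}),
)

state = {("at", "robot", "roomA")}
successor = move.apply(state, {"?r": "robot", "?from": "roomA", "?to": "roomB"})
assert successor == {("at", "robot", "roomB")}

Action-model learning systems of the kind reviewed in this survey attempt to recover exactly these precondition and effect sets automatically from execution traces, rather than relying on a domain expert to hand-code them.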

Type: Principles and Practice of Multi-Agent Systems
Copyright: © Cambridge University Press, 2018
