Hostname: page-component-848d4c4894-wzw2p Total loading time: 0 Render date: 2024-06-06T10:24:35.842Z Has data issue: false hasContentIssue false

Automatic Differentiation in Prolog

Published online by Cambridge University Press:  06 July 2023

TOM SCHRIJVERS
Affiliation:
KU Leuven, Leuven, Belgium (e-mails: tom.schrijvers@kuleuven.be, birthe.vandenberg@kuleuven.be)
BIRTHE VAN DEN BERG
Affiliation:
KU Leuven, Leuven, Belgium (e-mails: tom.schrijvers@kuleuven.be, birthe.vandenberg@kuleuven.be)
FABRIZIO RIGUZZI
Affiliation:
Università degli Studi di Ferrara, Ferrara, Italy (e-mail: fabrizio.riguzzi@unife.it)

Abstract

Automatic differentiation (AD) is a range of algorithms to compute the numeric value of a function’s (partial) derivative, where the function is typically given as a computer program or abstract syntax tree. AD has become immensely popular as part of many learning algorithms, notably for neural networks. This paper uses Prolog to systematically derive gradient-based forward- and reverse-mode AD variants from a simple executable specification: evaluation of the symbolic derivative. Along the way we demonstrate that several Prolog features (DCGs, co-routines) contribute to the succinct formulation of the algorithm. We also discuss two applications in probabilistic programming that are enabled by our Prolog algorithms. The first is parameter learning for the Sum-Product Loop Language and the second consists of both parameter learning and variational inference for probabilistic logic programming.

Type
Original Article
Copyright
© The Author(s), 2023. Published by Cambridge University Press

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

Footnotes

*

This project is partly funded by the Flemish Fund for Scientific Research (FWO).

References

Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mane, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viegas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y. and Zheng, X. 2015. TensorFlow: Large-scale machine learning on heterogeneous systems. Software available from tensorflow.org.Google Scholar
Abdallah, S. 2017. Automatic differentiation using constraint handling rules in Prolog. URL: https://arxiv.org/abs/1706.00231.Google Scholar
Bradbury, J., Frostig, R., Hawkins, P., Johnson, M. J., Leary, C., Maclaurin, D., Necula, G., Paszke, A., VanderPlas, J., Wanderman-Milne, S. and Zhang, Q. 2018. JAX: Composable transformations of Python+NumPy programs. URL: https://github.com/google/jax.Google Scholar
Bruynooghe, M. 2004. Enhancing a search algorithm to perform intelligent backtracking. Theory and Practice of Logic Programming 4, 3, 371380.10.1017/S1471068403001893CrossRefGoogle Scholar
Carpenter, B., Gelman, A., Hoffman, M. D., Lee, D., Goodrich, B., Betancourt, M., Brubaker, M., Guo, J., Li, P. and Riddell, A. 2017. Stan: A probabilistic programming language. Journal of Statistical Software 76, 1, 132.Google Scholar
Carpenter, B., Hoffman, M. D., Brubaker, M., Lee, D., Li, P. and Betancourt, M. 2015. The stan math library: Reverse-mode automatic differentiation in c++. arXiv preprint arXiv:1509.07164.Google Scholar
Clocksin, W. F. and Mellish, C. S. 2003. Programming in Prolog, 5th ed. Springer, Berlin.CrossRefGoogle Scholar
Dash, S., Kaddar, Y., Paquet, H. and Staton, S. 2023. Affine monads and lazy structures for bayesian programming. Proceedings of the ACM on Programming Languages 7, POPL, 1338–1368.Google Scholar
Debray, S. K. 1988. Unfold/fold transformations and loop optimization of logic programs. In Proceedings of the ACM SIGPLAN 1988 Conference on Programming Language Design and Implementation. PLDI’88. New York, NY, USA, 297–307.Google Scholar
Howe, J. M. and King, A. 2012. A pearl on SAT and SMT solving in prolog. Theoretical Computer Science 435, 4355.10.1016/j.tcs.2012.02.024CrossRefGoogle Scholar
Islam, M. A., Ramakrishnan, C. and Ramakrishnan, I. 2012. Inference in probabilistic logic programs with continuous random variables. Theory and Practice of Logic Programming 12, 505523.CrossRefGoogle Scholar
Kimmig, A., Van den Broeck, G. and De Raedt, L. 2011. An algebraic prolog for reasoning about possible worlds. In Twenty-Fifth AAAI Conference on Artificial Intelligence.10.1609/aaai.v25i1.7852CrossRefGoogle Scholar
Kowalski, R. A. 1979. Algorithm = logic + control. Communications of the ACM 22, 7, 424436.CrossRefGoogle Scholar
Krawiec, F., Jones, S. P., Krishnaswami, N., Ellis, T., Eisenberg, R. A. and Fitzgibbon, A. W. 2022. Provably correct, asymptotically efficient, higher-order reverse-mode automatic differentiation. Proceedings of the ACM on Programming Languages 6, POPL, 1–30.Google Scholar
Manhaeve, R., Dumancic, S., Kimmig, A., Demeester, T. and De Raedt, L. 2018. Deepproblog: Neural probabilistic logic programming. In Advances in Neural Information Processing Systems 31.Google Scholar
Nagata, M. 1962. Local Rings. Wiley Interscience.Google Scholar
Nguyen, M., Perera, R., Wang, M. and Wu, N. 2022. Modular probabilistic models via algebraic effects. Proceedings of the ACM on Programming Languages 6, ICFP, 381–410.Google Scholar
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Kopf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., Bai, J. and Chintala, S. 2019. Pytorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32. Curran Associates, Inc., 8024–8035.Google Scholar
Pearlmutter, B. A. and Siskind, J. M. 2008. Reverse-mode AD in a functional framework: Lambda the ultimate backpropagator. ACM Transactions on Programming Languages and Systems 30, 2, 7:1–7:36.Google Scholar
Pfanschilling, V., Shindo, H., Dhami, D. S. and Kersting, K. 2022. Sum-product loop programming: From probabilistic circuits to loop programming. In Proceedings of KR.10.24963/kr.2022/47CrossRefGoogle Scholar
Riguzzi, F. 2018. Foundations of Probabilistic Logic Programming: Languages, Semantics, Inference and Learning. River Publishers, Gistrup, Denmark.Google Scholar
Robbins, E., King, A. and Howe, J. M. 2021. Backjumping is exception handling. Theory and Practice of Logic Programming 21, 2, 125144.10.1017/S1471068420000435CrossRefGoogle Scholar
Sato, T. and Kameya, Y. 1997. PRISM: A language for symbolic-statistical modeling. In IJCAI, Vol. 97, 13301339.Google Scholar
Schrijvers, T. and Frühwirth, T. W. 2006. Optimal union-find in constraint handling rules. Theory and Practice of Logic Programming 6, 1–2, 213224.CrossRefGoogle Scholar
Smeding, T. and Vákár, M. 2022. Efficient dual-numbers reverse ad via well-known program transformations. URL: https://arxiv.org/abs/2207.03418.CrossRefGoogle Scholar
Szirmay-Kalos, L. 2021. Higher order automatic differentiation with dual numbers. Periodica Polytechnica Electrical Engineering and Computer Science 65, 1, 110.Google Scholar
van den Berg, B., Schrijvers, T., McKinna, J. and Vandenbroucke, A. 2022. Forward- or reverse-mode automatic differentiation: What’s the difference? CoRR abs/2212.11088.10.2139/ssrn.4358090CrossRefGoogle Scholar
Vandecasteele, H. and Janssens, G. 2003. An open ended tree. Theory and Practice of Logic Programming 3, 3, 377385.CrossRefGoogle Scholar
Wang, F., Zheng, D., Decker, J. M., Wu, X., Essertel, G. M. and Rompf, T. 2019. Demystifying differentiable programming: Shift/reset the penultimate backpropagator. Proceedings of the ACM on Programming Languages 3, ICFP, 96:1–96:31.Google Scholar
Wengert, R. E. 1964. A simple automatic derivative evaluation program. Communications of the ACM 7, 8, 463464.CrossRefGoogle Scholar
Wingate, D. and Weber, T. 2013. Automated variational inference in probabilistic programming. URL: https://doi.org/10.48550/arXiv.1301.1299.CrossRefGoogle Scholar