
Expression unleashed in artificial intelligence

Published online by Cambridge University Press: 17 February 2023

Ekaterina I. Tolstaya
Affiliation:
Waymo LLC, New York, NY, USA eig@waymo.com http://katetolstaya.com/
Abhinav Gupta
Affiliation:
MILA, Montreal, QC H2S 3H1, Canada abhinavg@nyu.edu https://mila.quebec/en/person/abhinav-gupta/
Edward Hughes
Affiliation:
DeepMind, London, UK. edwardhughes@google.com http://edwardhughes.io

Abstract

The problem of generating generally capable agents is an important frontier in artificial intelligence (AI) research. Such agents may demonstrate open-ended, versatile, and diverse modes of expression, similar to humans. We interpret the work of Heintz & Scott-Phillips as a minimal sufficient set of socio-cognitive biases for the emergence of generally expressive AI, separate from, yet complementary to, existing algorithms.

Type
Open Peer Commentary
Copyright
Copyright © The Author(s), 2023. Published by Cambridge University Press
