Expression unleashed in artificial intelligence

Ekaterina I. Tolstaya; Abhinav Gupta; Edward Hughes

doi:10.1017/S0140525X22000814

Published online by Cambridge University Press: 17 February 2023

Ekaterina I. Tolstaya: Affiliation:
Waymo LLC, New York, NY, USA eig@waymo.com http://katetolstaya.com/
Abhinav Gupta: Affiliation:
MILA, Montreal, QC H2S 3H1, Canada abhinavg@nyu.edu https://mila.quebec/en/person/abhinav-gupta/
Edward Hughes: Affiliation:
DeepMind, London, UK. edwardhughes@google.com http://edwardhughes.io

Abstract

Generating generally capable agents is an important frontier in artificial intelligence (AI) research. Such agents may demonstrate open-ended, versatile, and diverse modes of expression, similar to humans. We interpret the work of Heintz & Scott-Phillips as a minimal sufficient set of socio-cognitive biases for the emergence of generally expressive AI, separate from, yet complementary to, existing algorithms.

Type: Open Peer Commentary
Information: Behavioral and Brain Sciences, Volume 46, 2023, e16

DOI: https://doi.org/10.1017/S0140525X22000814
Copyright: © The Author(s), 2023. Published by Cambridge University Press
