We postulate that the hard problem in (natural or artificial) intelligence is the question of “what to learn?”. At the fundamental level, this meta-question is resolved in nature by the evolutionary process. The question of “how to learn?”, which is the focus of the meta-learning framework presented in the target article, is not especially easy either, but it can be captured by devising specific training structures and relevant optimization tasks. In general, it requires specifying the “search space” of possible learning strategies. The hard problem of learning, however, is the identification of the learning task itself (i.e., what, and whether, to learn) (Niv, 2019). For instance, a real-life learner observing several specimens of some unknown insect species (following the example in the target article) must first somehow realize that she is required to evaluate the average length of that species, before she begins to tune her evaluation strategies. This is indeed a different meta-task from the one presented by Binz et al., but its solution is mandatory for any artificial (somewhat-)general intelligence, and it is regularly handled by the brain (Roli, Jaeger, & Kauffman, 2022).
In the quest to devise domain-general learning models, Binz et al. correctly identify the need for diverse (and maybe realistic) training sets. Training a model on many different tasks can achieve high performance in all of them, and maybe even in unrelated, but similar, tasks. Yet, the model will always be constrained by the task-space spanned by its training sets. The major challenge does not lie in amplifying the dimensionality, or variability, of the learned problem, but rather in determining the appropriate objective function. Here, we may be inspired by the observation that biological brains have in general not evolved for their ability to solve a specific task, but, rather, are shaped by the overall success of the organism. On the one hand, evolutionary success obscures the objective of each specific task, since it depends on long-term benefits that are not always clearly related to short-term behavior. On the other hand, evolutionary success is a broader optimization challenge. A generic model that can both solve a maze and evaluate the average length of a newly identified insect, without being trained specifically on these tasks, must solve the hard problem of what-to-learn in a given context. To build a model that addresses this challenge, we cannot handcraft the utility function (or error measurements) of each task separately. The meta-learning requirement thus becomes learning how to identify the utility in learning, or in performing, each of the given tasks and, more broadly, how to identify the task itself. It is therefore constraining to use training sets and error functions that provide the learner with “correct” answers or feedback for each task separately, as is typically done in supervised, semi-supervised, and reinforcement machine learning.
The biological brain is overall domain-general since it is not guided by a task-specific “utility function.” Domain-specific processes, such as, perhaps, those suggested for language processing (Fedorenko & Blank, 2020), demonstrate cases in which natural selection narrowed or “optimized” the task of finding what-to-learn. Other indications may include modularity (Ellefsen, Mouret, & Clune, 2015; Sporns & Betzel, 2016), alongside sensory adaptations (Warrant, 2016), attention biases (Niv et al., 2015), and data acquisition mechanisms (Lotem & Halpern, 2012). Furthermore, in humans, cultural evolution may also adjust task specificity (Heyes, 2018).
The evolutionary process may also explain the limitations of treating cognition as rational, or optimal. Binz et al. suggest that unrealistic aspects of Bayesian models can be mitigated using resource constraints, for which the offered meta-learning framework is suitable. The problem, however, is that human (and other animal) behavior is not straightforwardly rational, and often appears to defy Bayesian optimization (Tversky & Kahneman, 1981). Moreover, this may not be due to limited resources but because the success of living creatures is determined evolutionarily, rather than by immediate outcomes (Houston, McNamara, & Steer, 2007). When behavioral objectives are considered on an evolutionary scale, it may be revealed that they are (locally) optimal (Kacelnik, 2006), and this includes behaviors that depend upon learning, as is generally assumed in behavioral ecology. When tasks for which learning is evolutionarily beneficial end up being learned (i.e., when those individuals who learn have higher fitness), natural selection resolves the meta-learning hard problem of what-to-learn (Dunlap & Stephens, 2016). This may bias the things that animals are able to learn, by shaping the parameter search-space (Prat, Bshary, & Lotem, 2022), maybe that of the outer learning loop described by Binz et al. These biases are often addressed in the biological learning literature as sub-problems of the what-to-learn problem, and include when-to-learn or from-whom-to-learn (Laland, 2004).
We suggest that further advancements in meta-learning thinking require addressing the hard problem of learning as one of their aims. Inspired by (human and nonhuman) biological brains, this should be done by devising overarching objectives for learning algorithms that will enable them to learn what the learning tasks are. In nature, evolution provides some of the solution. Yet, it is not necessary to mimic the evolutionary process per se, but only to acknowledge the generality of evolutionary optimization in the natural world. To this end, it may be better to aspire to simulate nonhuman-animal behavioral studies, rather than psychological assays, since nonhuman animals are trained with no description of the boundaries of their task: they need to realize it by themselves (e.g., when a sparrow learns to relate sand color to food [Ben-Oren, Truskanov, & Lotem, 2022]). Thus, these studies usually contain a direct meta-learning challenge that requires solving the problem of what-to-learn.
Acknowledgements
We thank Yoav Ram for insightful comments on a previous version of the manuscript.
Financial support
This research received no specific grant from any funding agency, commercial, or not-for-profit sectors.
Competing interest
None.