Almaatouq et al. argue that one-at-a-time experiments hamper efficient exploration of target phenomena and theoretical integration. To address this, they suggest integrative experimentation: Data collection in a large, predetermined, experimental design space. Although integrative experimentation addresses many limitations of current experimental practices in the social and behavioral sciences, we argue that integrative experimentation risks being prematurely exploitative by (a) committing to existing experimental paradigms and dimensions of the corresponding design space, and (b) imposing constraints on theory-building. One-at-a-time experimentation serves a critical role in exploring useful experimental and theoretical paradigms that can then be effectively exploited by integrative experimentation.
Integrative experimentation exploits existing experimental paradigms and dimensions of the corresponding design spaces
Although integrative experimentation facilitates exploration within the prespecified design space, it exploits the information – or lack thereof – that informs the characterization of this space. To perform integrative experiments, scientists must identify a priori a small set of experimental tasks to invest in. Almaatouq et al. present several illustrative examples: Peterson, Bourgin, Agrawal, Reichman, and Griffiths (2021) invested enormous resources to collect human decisions on ~10,000 bandit gambles, Baribault et al. (2018) focused on a specific subliminal priming task, and Awad et al. (2018, 2020) extensively sampled a space of trolley problems. In fields where the nature of the experiments that best measure the underlying constructs is itself an area of active inquiry, experiments are run under imperfect knowledge about the paradigm that will best capture a target phenomenon. One-at-a-time experimentation enables open-ended, cheap, and sequentially adaptive exploration of experimental paradigms and assumptions about the design spaces corresponding to these paradigms – including exploration along previously unexplored dimensions of a theoretically infinite design space.
Most areas of the social and behavioral sciences use experimental manipulations and outcomes to measure unobservable constructs. Social and behavioral scientists in most domains are still engaged in iterative refinement of the experimental paradigms and dimensions of the design space that will best measure these constructs (Dubova & Goldstone, 2023). For instance, while a plethora of paradigms – including the multisource interference task, the task switching paradigm, and the N-back task – are used to study mental effort, there is little agreement about which experimental manipulations evoke mentally effortful processes, let alone how these manipulations would be combined into an integrative experiment (Bustamante et al., 2022; Koch, Poljac, Müller, & Kiesel, 2018; Kool, McGuire, Rosen, & Botvinick, 2010; Shenhav et al., 2017; Westbrook & Braver, 2015). Here, running integrative experiments can hinder solving the main problem of the field – identifying a set of experimental manipulations relevant to the construct of mental effort.
Almaatouq et al. give examples of areas in the social and behavioral sciences that are dominated by a small set of “standard” experimental paradigms, such as bandit gambles for risky decision making. In these cases, integrative experimentation can facilitate efficient exploration of behavior across the space defined by these paradigms. In other cases, however, integrative experimentation may actually hinder exploration of the target phenomena. For instance, early vision science operated in design spaces involving artificial visual stimuli. While integrative experimentation would have yielded theoretical commensurability in this space, one-at-a-time experiments (i.e., the use of stimuli that differed from the common design space) enabled a quick expansion of the space to natural stimuli that in turn led to rapid revisions of dominant theories of vision (Olshausen & Field, 2005; Zhaoping, 2014). Thus, scientific inquiry may often not justify a large investment of resources and interinstitutional coordination at the expense of expanding the design space or developing a number of completely new tasks.
Integrative experimentation exploits existing theoretical paradigms
Almaatouq et al. advocate for using integrative experiments to enforce commensurability of theoretical accounts for the data. However, this approach may prematurely prioritize some theoretical frameworks over others. For example, the BrainScore benchmark integrates neuroimaging studies on visual object recognition to standardize the comparison of formal theories of neural visual processing (Schrimpf et al., 2020). Although aiming for inclusivity, BrainScore's design required certain commitments, such as the set of target phenomena and measurements to be accounted for (i.e., neural recordings in object recognition experiments) and the form that the theories can take (i.e., neural networks mimicking the ventral stream, taking pixels as inputs, and predicting behavioral responses). Equally justified alternative benchmarks could have led to different theories of visual processing being prioritized: For instance, the dataset could have emphasized temporal aspects of vision, or clumped together object recognition with visual search tasks when identifying the domain space for theories to capture. Similarly, standardizing theoretical accounts by the constraints imposed by integrative experiments, which often focus on a single experimental paradigm, may hinder exploration of theoretical frameworks that target different aspects of the phenomena.
Many, if not most, areas of the social and behavioral sciences would benefit from facilitating investigation of a larger class of theoretical paradigms, rather than constraining theory-building. For example, cognitive science consists of incommensurable theoretical paradigms, such as rational analysis and dynamical systems, which make predictions about different, often nonoverlapping, aspects of cognitive phenomena. For instance, dynamical systems modeling seeks to capture the temporal aspects of a cognitive process, whereas rational analysis focuses on the outcomes of cognition. A diversity of theoretical frameworks informs the design of new experimental paradigms, broadens the collective conceptualization of the relevant design spaces (Chang, 2012; Massimi, 2022), and contributes to more comprehensive insights on cognition (Krakauer, Ghazanfar, Gomez-Marin, MacIver, & Poeppel, 2017; Marr, 1982; Medin & Bang, 2014). Constraining theory-building risks reinforcing biases that favor existing experimental paradigms, further inhibiting exploration of novel experimental and theoretical frameworks (Dubova, Moskvichev, & Zollman, 2022; Sloman, Oppenheimer, Broomell, & Shalizi, 2022).
Integrative and one-at-a-time experimentation benefit fields with different goals at different stages of their development
Viewed from a resource allocation perspective, scientific endeavors face an explore–exploit dilemma. Integrative experimentation facilitates broad characterization of behavior within a specific paradigm and its corresponding design space. One-at-a-time experimentation encourages iterative refinement of experimental paradigms and the development of new theoretical frameworks. We believe a combination of integrative and one-at-a-time experimentation is needed to effectively address the explore–exploit problem in the sciences.
Financial support
S. M. is supported by Schmidt Science Fellows, in partnership with the Rhodes Trust. M. R. N. is supported by NIH R01 MH126971.
Competing interest
None.