
Direct replications in the era of open sampling

Published online by Cambridge University Press:  27 July 2018

Gabriele Paolacci
Affiliation:
Rotterdam School of Management, Erasmus University Rotterdam, 3062 PA, Rotterdam, The Netherlands. gpaolacci@rsm.nl https://www.rsm.nl/people/gabriele-paolacci/
Jesse Chandler
Affiliation:
Mathematica Policy Research, Ann Arbor, MI 48104; Institute for Social Research, University of Michigan, Ann Arbor, MI 48109. jjchandl@umich.edu https://www.jessechandler.com

Abstract

Data collection in psychology increasingly relies on “open populations” of participants recruited online, which presents both opportunities and challenges for replication. Reduced costs and the ability to access the same populations allow for more informative replications. However, researchers should ensure the directness of their replications by addressing the threats of participant nonnaiveté and selection effects.

Type
Open Peer Commentary
Copyright
Copyright © Cambridge University Press 2018 

When the “crisis of confidence” struck psychology, reinvigorating the academic debate on replications, a parallel revolution was happening in the field: Data collection rapidly moved away from near-exclusive dependence on traditional participant pools (e.g., undergraduate samples) and toward sampling from online marketplaces where adults complete tasks (e.g., academic surveys) in exchange for compensation. About five years later, virtually every major journal in psychology and beyond routinely publishes studies conducted on Amazon Mechanical Turk, Prolific, or other third-party platforms (Chandler & Shapiro 2016; Stewart et al. 2017). Importantly, these marketplaces are typically “open” on both ends: Compared to any traditional participant pool (e.g., psychology undergraduates at a Midwestern university), few restrictions exist on who can join the participant pool and on who can recruit participants from it.

Zwaan et al. provide a compelling case for direct replications, emphasizing both the necessity of being able to reproduce the procedures used in the original experiments and the lack of structural obstacles to making replication a habit in the field. However, Zwaan et al. do not discuss how direct replications are affected by current data collection practices, and in particular by researchers' increasing reliance on open sampling. We build on the target article by highlighting both the opportunities open sampling offers for making direct replication mainstream and the challenges of conducting a proper direct replication with these samples.

Open sampling can remove barriers to making replication habitual, while also making attempted replications more conclusive and compelling. First, data collection from open populations is comparatively faster and cheaper (even controlling for pay rate; Goodman & Paolacci 2017). This reduces concerns about committing scarce resources to replication and allows researchers to recruit larger samples within the same time and budget. Larger samples benefit any study, and particularly replication studies, which demand even more participants than original studies to support conclusive statements (Simonsohn 2015).
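To make the sample-size point concrete, the following minimal sketch (not part of the original commentary) uses hypothetical numbers and assumes Python with statsmodels; the 2.5x heuristic follows Simonsohn's (2015) “small telescopes” recommendation for replication sample sizes.

```python
# Illustrative power calculation with hypothetical numbers.
# Assumption: statsmodels is available in the environment.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

n_original_per_cell = 40   # hypothetical original cell size
d_observed = 0.50          # hypothetical original effect size (Cohen's d)

# Power of a same-sized replication to detect the originally observed effect
power_same_n = analysis.solve_power(effect_size=d_observed,
                                    nobs1=n_original_per_cell,
                                    alpha=0.05)

# Simonsohn's (2015) heuristic: recruit roughly 2.5x the original sample
n_replication_per_cell = 2.5 * n_original_per_cell

print(f"Power at the original n per cell: {power_same_n:.2f}")
print(f"Suggested replication n per cell: {n_replication_per_cell:.0f}")
```

Under these assumed numbers, a same-sized replication is clearly underpowered, which is why the lower cost of open sampling matters for recruiting the larger samples that conclusive replications require.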

Second, original studies conducted on open populations can be replicated by different researchers using the same population. Sharing a population does not make a replication perfectly direct (we discuss below how this also holds for open populations), but it is a necessary precondition for more informative failed replications. Samples from different sources vary substantially on many characteristics, which can sometimes have a substantive impact on results (Krupnikov & Levine 2014). All else being equal, a failed replication on the same population is both less suggestive of hidden moderators and less ambiguous about which “hidden moderators” (if any) might be at play. When the replicator's goal is to increase the directness of a replication, rather than to discover population-level moderators of the target effect, open populations further reduce the “Context Is Too Variable” concern that Zwaan et al. address in the target article.

Despite these advantages, open sampling increases the directness of a replication only if researchers pay appropriate attention to sampling methodology. First, despite intuitions to the contrary, open populations have a large but limited number of participants (Difallah et al. 2018; Stewart et al. 2015). Because researchers use these populations to conduct many studies, often high-powered ones, concerns about participant nonnaiveté have arisen that are relevant to replication. Open populations include many participants who are experienced with research participation and who become increasingly familiar over time with specific research paradigms and instruments. Illustratively, popular paradigms are known to a large majority of participants (e.g., Chandler et al. 2014; Thomson & Oppenheimer 2016). Zwaan et al. highlight how some findings in cognitive psychology (i.e., perception/action, memory, and language) replicate even with participants who were previously exposed to them (Zwaan et al. 2017). However, this does not necessarily hold for every paradigm, and may be particularly untrue of replications in other psychological fields. There is evidence that experimental manipulations in social psychology and decision making that convey experience (e.g., tasks conducted under time pressure) or factual knowledge (e.g., numeric estimates following different numeric anchors) become weaker with repeated exposure. This can result in replications that are less statistically powerful than intended (Chandler et al. 2015; DeVoe & House 2016; Rand et al. 2014), and participant nonnaiveté should therefore be accounted for by direct replicators (Chandler et al. 2014).
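As a rough illustration of how nonnaiveté can erode power, the sketch below assumes a hypothetical 20% attenuation of the effect size among experienced participants; the numbers are not drawn from any of the cited studies, and statsmodels is again assumed to be available.

```python
# Illustrative sketch: a replication planned around the original effect size
# loses power if nonnaiveté attenuates the true effect. Numbers are hypothetical.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

d_original = 0.50    # effect size reported in a hypothetical original study
attenuation = 0.80   # assumed 20% shrinkage among nonnaive participants
d_nonnaive = d_original * attenuation

# Per-cell sample size planned to give 90% power for the *original* effect size
n_planned = analysis.solve_power(effect_size=d_original, power=0.90, alpha=0.05)

# Power actually achieved if the effect among nonnaive participants is smaller
power_actual = analysis.solve_power(effect_size=d_nonnaive,
                                    nobs1=n_planned, alpha=0.05)

print(f"Planned n per cell: {n_planned:.0f}")
print(f"Power against the attenuated effect: {power_actual:.2f}")
```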

Second, samples obtained from open populations are not probability samples, and thus can still vary as a result of procedural differences in sampling. Participants in open populations self-select into studies by choosing from many that differ on observable characteristics (e.g., payment, task description), which may make them more or less attractive to different people. Researchers may place explicit constraints on participant eligibility that have a measurable impact on data quality (e.g., worker reputation scores, Peer et al. 2014; or nationality, Chandler & Shapiro 2016) but that may go unreported. Other recruitment criteria that are not deliberately selected may still be impactful. The diversity of open populations compounds this concern, because it implies a comparatively high potential for procedural differences to meaningfully affect sample composition. Though evidence is still scarce, researchers have found that sample demographics fluctuate with time of day and day of the week (Arechar et al. 2017; Casey et al. 2017). Direct replicators therefore need to consider aspects of the original design (e.g., timing, study compensation) that are not typically assumed to be hidden moderators in undergraduate samples, which are less diverse and less shaped by self-selection. It also means that, in the era of open samples, original authors are as responsible as direct replicators for supporting replicability, by reporting their sampling choices in sufficient detail to enable meaningful replication.

In sum, we applaud the target article for convincingly addressing the most commonly raised concerns about replication, and we have placed some of its insights within the context of today's dominant practice in data collection: open sampling. We hope this commentary will contribute to making informative replication mainstream, by encouraging researchers both to embrace the advantages of open sampling and to consider what transparent reporting of sampling methods and direct replication mean when using these samples.

References

Arechar, A. A., Kraft-Todd, G. T. & Rand, D. G. (2017) Turking overtime: How participant characteristics and behavior vary over time and day on Amazon Mechanical Turk. Journal of the Economic Science Association 3(1):1–11.
Casey, L., Chandler, J., Levine, A. S., Proctor, A. & Strolovitch, D. Z. (2017, April–June) Intertemporal differences among MTurk worker demographics. SAGE Open 1–15. doi: 10.1177/2158244017712774.
Chandler, J., Mueller, P. & Paolacci, G. (2014) Nonnaïveté among Amazon Mechanical Turk workers: Consequences and solutions for behavioral researchers. Behavior Research Methods 46(1):112–30.
Chandler, J., Paolacci, G., Peer, E., Mueller, P. & Ratliff, K. A. (2015) Using nonnaive participants can reduce effect sizes. Psychological Science 26(7):1131–39.
Chandler, J. & Shapiro, D. (2016) Conducting clinical research using crowdsourced convenience samples. Annual Review of Clinical Psychology 12:53–81.
DeVoe, S. E. & House, J. (2016) Replications with MTurkers who are naïve versus experienced with academic studies: A comment on Connors, Khamitov, Moroz, Campbell, and Henderson (2015). Journal of Experimental Social Psychology 67:65–67.
Difallah, D., Filatova, E. & Ipeirotis, P. (2018) Demographics and dynamics of Mechanical Turk workers. In: Proceedings of WSDM 2018: The Eleventh ACM International Conference on Web Search and Data Mining, Marina Del Rey, CA, USA, February 5–9, 2018, pp. 135–43. Available at: https://dl.acm.org/citation.cfm?doid=3159652.3159661.
Goodman, J. K. & Paolacci, G. (2017) Crowdsourcing consumer research. Journal of Consumer Research 44(1):196–210.
Krupnikov, Y. & Levine, A. S. (2014) Cross-sample comparisons and external validity. Journal of Experimental Political Science 1(1):59–80.
Peer, E., Vosgerau, J. & Acquisti, A. (2014) Reputation as a sufficient condition for data quality on Amazon Mechanical Turk. Behavior Research Methods 46(4):1023–31.
Simonsohn, U. (2015) Small telescopes: Detectability and the evaluation of replication results. Psychological Science 26:559–69.
Stewart, N., Chandler, J. & Paolacci, G. (2017) Crowdsourcing samples in cognitive science. Trends in Cognitive Sciences 21(10):736–48.
Stewart, N., Ungemach, C., Harris, A. J., Bartels, D. M., Newell, B. R., Paolacci, G. & Chandler, J. (2015) The average laboratory samples a population of 7,300 Amazon Mechanical Turk workers. Judgment and Decision Making 10(5):479–91.
Thomson, K. S. & Oppenheimer, D. M. (2016) Investigating an alternate form of the cognitive reflection test. Judgment and Decision Making 11(1):99–113.
Zwaan, R. A., Pecher, D., Paolacci, G., Bouwmeester, S., Verkoeijen, P., Dijkstra, K. & Zeelenberg, R. (2017) Participant nonnaiveté and the reproducibility of cognitive psychology. Psychonomic Bulletin & Review. Available at: http://doi.org/10.3758/s13423-017-1348-y.