Can We Be Wrong? The Problem of Textual Evidence in a Time of Data

Published online by Cambridge University Press:  22 September 2020

Andrew Piper
Affiliation: McGill University, Montréal

Summary

This Element tackles the problem of generalization with respect to text-based evidence in the field of literary studies. When working with texts, how can we move, reliably and credibly, from individual observations to more general beliefs about the world? The onset of computational methods has highlighted major shortcomings of traditional approaches to texts when it comes to working with small samples of evidence. This Element combines a machine learning-based approach, used to detect the prevalence and nature of generalization across tens of thousands of sentences from different disciplines, with a robust discussion of potential solutions to the problem of the generalizability of textual evidence. It exemplifies the way mixed methods can be used in complementary fashion to develop nuanced, evidence-based arguments about complex disciplinary issues in a data-driven research environment.
Type: Element
Information
Online ISBN: 9781108922036
Publisher: Cambridge University Press
Print publication: 19 November 2020
