Can We Be Wrong? The Problem of Textual Evidence in a Time of Data

Andrew Piper

doi:10.1017/9781108922036

Series: Elements in Digital Literary Studies

Published online by Cambridge University Press: 22 September 2020

Affiliation: McGill University, Montréal

Summary

This Element tackles the problem of generalization with respect to text-based evidence in the field of literary studies. When working with texts, how can we move, reliably and credibly, from individual observations to more general beliefs about the world? The onset of computational methods has highlighted major shortcomings of traditional approaches to texts when it comes to working with small samples of evidence. This Element combines a machine learning-based approach to detecting the prevalence and nature of generalization across tens of thousands of sentences from different disciplines with a robust discussion of potential solutions to the problem of the generalizability of textual evidence. It exemplifies how mixed methods can be used in complementary fashion to develop nuanced, evidence-based arguments about complex disciplinary issues in a data-driven research environment.

Keywords

digital humanities, humanities, machine learning, literary studies, text mining

Information

Type: Element
DOI: https://doi.org/10.1017/9781108922036

Online ISBN: 9781108922036

Publisher: Cambridge University Press

Print publication: 19 November 2020
