Epistemic Markers in the Scientific Discourse

Published online by Cambridge University Press:  22 August 2023

Christophe Malaterre*
Affiliation:
Département de philosophie and Centre interuniversitaire de recherche sur la science et la technologie (CIRST), Université du Québec à Montréal (UQAM), Montréal, Québec, Canada
Martin Léonard
Affiliation:
Département de philosophie and Centre interuniversitaire de recherche sur la science et la technologie (CIRST), Université du Québec à Montréal (UQAM), Montréal, Québec, Canada
*Corresponding author: Christophe Malaterre; Email: malaterre.christophe@uqam.ca

Abstract

The central role of such epistemic concepts as theory, explanation, model, or mechanism is rarely questioned in philosophy of science. Yet, what is their actual use in the practice of science? Here, we deploy text-mining methods to investigate the usage of 61 epistemic notions in a corpus of full-text articles from the biological and biomedical sciences (N = 73,771). The influence of disciplinary context is also examined by splitting the corpus into subdisciplinary clusters. The results reveal the intricate semantic networks that these concepts actually form in the scientific discourse, not always following our intuitions, at least in some parts of science.

Information

Type
Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided that no alterations are made and the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use and/or adaptation of the article.
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of the Philosophy of Science Association

Figure 1. Research design. Three major steps of computational methods, from corpus-data preparation to semantic field correlation analyses. (Textual corpus in dark blue, data in light blue, operations in orange, analyses in red.) Figure in color online.

Table 1. The 61 Semantic Fields (Noted in Bold Italic Typeface) Sorted by Type and with Their Respective Terms

Table 2. The Seven Article Clusters (Noted in Small Caps Typeface) with Their Number of Articles and Their Topical Profiles

Figure 2. Average occurrence of the semantic fields per document over the complete corpus. (Semantic fields are grouped and colored by type. Surface area is proportional to the average number of occurrences of terms related to each semantic field; to give an idea of scale: suggest, in the upper right-hand corner, has an average of 5.1 occurrences per article. Numerical values are available in the “Data_for_graphs” file; see Supplementary Information section.) Figure in color online.

Figure 3. Correlation network of the most significant semantic fields over the whole corpus. (Correlations calculated at the paragraph level; for analysis, only correlations of >0.1 were kept; all ps < .001. Colors indicate correlation clusters based on Louvain community detection, node size is proportional to average occurrence in the corpus, and edge thickness is proportional to correlation strength; rendering was done with ForceAtlas2 on Gephi [Bastian et al. 2009]. To give an idea of scale: suggest, in the upper-middle part of the network, has an average of 5.1 occurrences per article. Numerical values are available in the “Data_for_graphs” file; see Supplementary Information section.) Figure in color online.
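The edge-selection step described in the Figure 3 caption (paragraph-level correlations, keeping only those above 0.1) can be illustrated with a minimal sketch. It makes several assumptions not stated in the caption: Pearson correlation is used, each semantic field is represented by raw occurrence counts per paragraph, and the field names and toy data are hypothetical. It does not compute p-values, and it does not reproduce the Louvain community detection or the ForceAtlas2 layout, which the authors ran separately.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: correlation undefined, treated as no link
    return cov / sqrt(var_x * var_y)


def correlation_edges(counts, threshold=0.1):
    """Return (field_a, field_b, r) for every pair with r > threshold.

    counts maps a semantic-field name to its per-paragraph occurrence
    counts; all lists must cover the same paragraphs in the same order.
    """
    edges = []
    for a, b in combinations(sorted(counts), 2):
        r = pearson(counts[a], counts[b])
        if r > threshold:
            edges.append((a, b, r))
    return edges


# Hypothetical toy data (assumption): six paragraphs, three semantic fields.
toy_counts = {
    "model": [2, 0, 1, 0, 3, 0],
    "predict": [1, 0, 1, 0, 2, 0],
    "suggest": [0, 1, 0, 2, 0, 1],
}

edges = correlation_edges(toy_counts)
for a, b, r in edges:
    print(f"{a} -- {b}: r = {r:.2f}")
```

The resulting edge list, with correlation strength as edge weight, is the kind of input a graph tool would take for community detection and layout.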

Figure 4. Correlation network of the most significant semantic fields for the seven disciplinary clusters. (Correlations calculated at the paragraph level; for analysis, only correlations > 0.1 were kept; all ps < .001. Node size is proportional to average occurrence in the cluster; edge thickness is proportional to correlation strength. Subnetwork identification is based on Louvain community detection; subnetwork colors depend on the size of the network, from the largest subnetwork in blue to the smallest subnetwork in brown. Rendering was done with ForceAtlas2 on Gephi. To give an idea of scale: model has an average of 25.6 occurrences per article in network (a) and 6.1 in network (g). Numerical values are available in the “Data_for_graphs” file; see Supplementary Information section.) Figure in color online.