A state-of-the-art of semantic change computation

Published online by Cambridge University Press:  18 June 2018

XURI TANG*
Affiliation:
School of Foreign Languages, Huazhong University of Science and Technology, Wuhan, China e-mail: xrtang@hust.edu.cn

Abstract

This paper reviews the state of the art of an emergent field in computational linguistics: semantic change computation. It summarizes the literature by proposing a framework that identifies five components in the field: diachronic corpus, diachronic word sense characterization, change modelling, evaluation and data visualization. Despite the field's potential, the review shows that current studies focus mainly on testing hypotheses of semantic change from theoretical linguistics and that several core issues remain to be tackled: the need for diachronic corpora for languages other than English, the comparison and development of approaches to diachronic word sense characterization and change modelling, the need for comprehensive evaluation data, and further exploration of data visualization techniques for hypothesis justification.

Type
Survey Paper
Copyright
Copyright © Cambridge University Press 2018 

Footnotes

The author is much obliged to the three anonymous reviewers for their inspiring comments, which have helped improve the paper's readability and comprehensiveness. This research is supported by the Fund of Chinese Natural Science (Grant 61772278) and the Innovation Fund of Huazhong University of Science and Technology (Grant 2018WKZDJC003).
