An analysis of property inference methods

Alex Rosenfeld; Katrin Erk

doi:10.1017/S1351324921000267

Published online by Cambridge University Press: 14 January 2022

Alex Rosenfeld*: Intelligent Automation, Inc., Rockville, MD 20855, USA
Katrin Erk: Department of Linguistics, The University of Texas at Austin, Austin, TX 78705, USA
*Corresponding author. E-mail: alexbrosenfeld@gmail.com

Abstract

Property inference involves predicting properties for a word from its distributional representation. We focus on human-generated resources that link words to their properties and on the task of predicting these properties for unseen words. We introduce the use of label propagation, a semi-supervised machine learning approach, for this task and, in the first systematic study of models for this task, find that label propagation achieves state-of-the-art results. For more variety in the kinds of properties tested, we introduce two new property datasets.
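To make the idea of label propagation concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes a fully connected graph whose edge weights are cosine similarities clipped at zero, clamps the labelled (seed) words, and repeatedly replaces each unlabelled word's property scores with the weighted average of its neighbours' scores. The function names, the toy two-dimensional vectors, and the property names `is_animal` and `has_wheels` are all hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def propagate(vectors, seeds, iterations=50):
    """Spread property scores from seed words to all other words.

    vectors: word -> list of floats (a stand-in for a distributional representation)
    seeds:   word -> {property: score} for words whose properties are known
    Returns  word -> {property: score} for every word in `vectors`.
    """
    words = list(vectors)
    props = sorted({p for labels in seeds.values() for p in labels})
    # Edge weights: cosine similarity clipped at zero, with no self-loops.
    weight = {
        w: {x: max(0.0, cosine(vectors[w], vectors[x])) for x in words if x != w}
        for w in words
    }
    # Seed words start with their known scores; all other words start at zero.
    scores = {w: {p: seeds.get(w, {}).get(p, 0.0) for p in props} for w in words}
    for _ in range(iterations):
        updated = {}
        for w in words:
            if w in seeds:
                updated[w] = dict(scores[w])  # clamp known labels
                continue
            total = sum(weight[w].values())
            updated[w] = {
                p: (sum(weight[w][x] * scores[x][p] for x in weight[w]) / total
                    if total else 0.0)
                for p in props
            }
        scores = updated
    return scores


# Hypothetical toy data: two seed words with known properties, two unseen words.
vectors = {
    "dog": [1.0, 0.1],
    "cat": [0.9, 0.2],
    "car": [0.1, 1.0],
    "truck": [0.2, 0.9],
}
seeds = {
    "dog": {"is_animal": 1.0, "has_wheels": 0.0},
    "car": {"is_animal": 0.0, "has_wheels": 1.0},
}
scores = propagate(vectors, seeds)
for word in ("cat", "truck"):
    print(word, {p: round(s, 3) for p, s in scores[word].items()})
```

In this toy setting the unseen word "cat" ends up with a higher score for `is_animal` than for `has_wheels`, and "truck" the reverse, because each inherits mostly from its nearest labelled neighbour. A realistic setup would use high-dimensional vectors, a sparser neighbourhood graph, and a regularized propagation variant.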

Keywords

Semantics; Machine learning; Lexical knowledge acquisition

Type: Article
Information: Natural Language Engineering, Volume 29, Issue 2, March 2023, pp. 201–227

DOI: https://doi.org/10.1017/S1351324921000267
Copyright: © The Author(s), 2022. Published by Cambridge University Press

Footnotes

† Research performed while attending The University of Texas at Austin.
