Symbolic Natural Language Processing

M. Lothaire

doi:10.1017/CBO9781107341005.004

Chapter 3 - Symbolic Natural Language Processing

Published online by Cambridge University Press: 05 June 2013

M. Lothaire

Book contents

Get access

Summary

Introduction

Fundamental notions of combinatorics on words underlie natural language processing. This is not surprising, since combinatorics on words can be seen as the formal study of sets of strings, and sets of strings are fundamental objects in language processing.

Indeed, language processing is obviously a matter of strings. A text or a discourse is a sequence of sentences; a sentence is a sequence of words; a word is a sequence of letters. The most universal levels are those of sentence, word, and letter (or phoneme), but intermediate levels exist, and can be crucial in some languages, between word and letter: a level of morphological elements (e.g. suffixes), and the level of syllables. The discovery of this piling up of levels, and in particular of word level and phoneme level, delighted structuralist linguists in the twentieth century. They termed this inherent, universal feature of human language “double articulation”.

It is a little more intricate to see how sets of strings are involved. There are two main reasons. First, at a point in a linguistic flow of data being processed, you must be able to predict the set of possible continuations after what is already known, or at least to expect any continuation among some set of strings that depends on the language. Second, natural languages are ambiguous, that is a written or spoken portion of text can often be understood or analysed in several ways, and the analyses are handled as a set of strings as long as they cannot be reduced to a single analysis.

Information

Type: Chapter
Information: Applied Combinatorics on Words , pp. 164 - 209

DOI: https://doi.org/10.1017/CBO9781107341005.004 [Opens in a new window]

Publisher: Cambridge University Press

Print publication year: 2005

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

Book purchase

Temporarily unavailable

Accessibility standard: Unknown

Why this information is here

This section outlines the accessibility features of this content - including support for screen readers, full keyboard navigation and high-contrast display options. This may not be relevant for you.

Accessibility Information

Accessibility compliance for the PDF of this book is currently unknown and may be updated in the future.