9 - Applications
from Part IV - Graph-Based Natural Language Processing
Published online by Cambridge University Press: 01 June 2011
Summary
This chapter addresses graph-theoretical methods for text-processing applications. The discussion includes topic identification and text summarization using graph-centrality methods; keyword extraction using random-walk language models; text segmentation using normalized-cut criteria for graph partitioning; graph structures to encode discourse relationships; word graphs for decoding in machine translation and speech processing; random-walk algorithms for translation selection in cross-language information retrieval; and graph representations and patterns on graphs for information extraction and question answering.
Summarization
Automatic summarization has received attention from the natural language processing community ever since the early approaches to automatic abstraction that laid the foundations of current text-summarization techniques (Luhn 1958; Edmundson 1969). The literature typically distinguishes between extraction, which is concerned with identifying the important information in the input text, and abstraction, which involves a generation step to add fluency to a previously compressed text.
Most efforts to date have focused on the extraction step, which is perhaps the most critical component in a successful summarization algorithm. Among these efforts, some of the most promising approaches are based on graph representations of the text, which enable the application of graph-theoretical algorithms to identify the most salient elements in the text.
One of the first summarization techniques based on graphs is a method that creates graph representations for encyclopedic articles, in which nodes correspond to paragraphs and edges connect lexically similar paragraphs (Salton et al. 1994, 1997).
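The paragraph-graph idea can be illustrated with a minimal sketch. It is not the method of Salton et al.; it only assumes that "lexically similar" can be approximated by cosine similarity over raw term counts, and that an edge is drawn when the similarity reaches a cutoff. The function names, the tokenizer, the `threshold` value of 0.2, and the use of node degree as a stand-in for centrality are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def term_counts(paragraph):
    """Lowercase a paragraph and count its alphabetic word tokens."""
    return Counter(re.findall(r"[a-z]+", paragraph.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_paragraph_graph(paragraphs, threshold=0.2):
    """Return an adjacency dict: paragraph index -> set of similar indices.

    The threshold is an assumed cutoff, not a value taken from the source.
    """
    vectors = [term_counts(p) for p in paragraphs]
    graph = {i: set() for i in range(len(paragraphs))}
    for i in range(len(paragraphs)):
        for j in range(i + 1, len(paragraphs)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                graph[i].add(j)
                graph[j].add(i)
    return graph


def rank_by_degree(graph):
    """Order nodes from most to least connected (ties broken by index)."""
    return sorted(graph, key=lambda n: (-len(graph[n]), n))


paragraphs = [
    "Graphs model text as nodes and edges.",
    "Edges connect nodes when paragraphs share text.",
    "The weather was sunny and warm.",
]
graph = build_paragraph_graph(paragraphs)
ranking = rank_by_degree(graph)
```

In this toy input the first two paragraphs share three terms and become neighbors, while the third stays isolated. A well-connected paragraph overlaps lexically with many others, which is one simple reason to treat it as a candidate for extraction; a real system would likely use weighted terms and a more careful centrality measure.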
- Type: Chapter
- Publisher: Cambridge University Press
- Print publication year: 2011