Chapter 12: Language models for information retrieval

pp. 218-233

Authors

Stanford University, California; Google, Inc.; Universität Stuttgart

Summary

A common suggestion to users for coming up with good queries is to think of words that would likely appear in a relevant document, and to use those words as the query. The language modeling approach to information retrieval (IR) directly models that idea: a document is a good match to a query if the document model is likely to generate the query, which will in turn happen if the document contains the query words often. This approach thus provides a different realization of some of the basic ideas for document ranking which we saw in Section 6.2 (page 107). Instead of overtly modeling the probability P(R = 1|q, d) of relevance of a document d to a query q, as in the traditional probabilistic approach to IR (Chapter 11), the basic language modeling approach instead builds a probabilistic language model M_d from each document d, and ranks documents based on the probability of the model generating the query: P(q|M_d).
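To make the ranking idea concrete, here is a minimal Python sketch of scoring documents by P(q|M_d). It is an illustration under stated assumptions, not the chapter's own implementation: it assumes a unigram model estimated by simple relative frequency, whitespace tokenization, and no smoothing, so any query word absent from a document drives that document's score to zero. The function names `query_likelihood` and `rank` and the toy documents are hypothetical.

```python
from collections import Counter


def query_likelihood(query, document):
    """Estimate P(q | M_d) with an unsmoothed unigram model of the document.

    Assumption: each query term is generated independently, with probability
    equal to its relative frequency in the document.
    """
    doc_tokens = document.lower().split()
    if not doc_tokens:
        return 0.0
    counts = Counter(doc_tokens)
    total = len(doc_tokens)
    prob = 1.0
    for term in query.lower().split():
        # Counter returns 0 for unseen terms, so the product becomes 0.
        prob *= counts[term] / total
    return prob


def rank(query, documents):
    """Return document ids ordered by decreasing P(q | M_d)."""
    scored = [(query_likelihood(query, text), doc_id)
              for doc_id, text in documents.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored]


# Hypothetical toy collection.
docs = {
    "d1": "frog said that toad likes frog",
    "d2": "toad likes dog",
}
ranking_a = rank("toad likes", docs)   # d2 first: both words are frequent in a short document
ranking_b = rank("frog likes", docs)   # d1 first: d2 never mentions "frog"
```

In this sketch, a document scores well only if it contains every query word, and scores better the larger the share of its text those words occupy. Practical systems smooth the document model so that a single missing word does not zero out the score.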

In this chapter, we first introduce the concept of language models (Section 12.1) and then describe the basic and most commonly used language modeling approach to IR, the query likelihood model (Section 12.2). After some comparisons between the language modeling approach and other approaches to IR (Section 12.3), we finish by briefly describing various extensions to the language modeling approach (Section 12.4).

Language models

Finite automata and language models

What do we mean by a document model generating a query?
