Link Analysis for the World Wide Web

Rada Mihalcea; Dragomir Radev

doi:10.1017/CBO9780511976247.006

5 - Link Analysis for the World Wide Web

from Part III - Graph-Based Information Retrieval

Published online by Cambridge University Press: 01 June 2011


Rada Mihalcea, University of North Texas
Dragomir Radev, University of Michigan, Ann Arbor


Summary

This chapter addresses link-analysis methods used by search engines, such as PageRank and HITS, and covers topics relevant to their application, including method stability, the combination of link- and content-based models, topic-sensitive ranking, and query-dependent link analysis.

The Web as a Graph

The Web, a common abbreviation for the World Wide Web, consists of billions of interlinked hypertext pages. These pages contain text, images, videos, or sounds and are usually viewed using Web browsers, such as Firefox or Internet Explorer. Users can navigate the Web either by typing the address of a Web page (i.e., the URL) directly into a browser or by following the links that connect Web pages to one another.

The Web is a typical example of a directed graph: Web pages correspond to vertices, and links between pages correspond to directed edges. For instance, suppose the page http://www.unt.edu includes a link to http://www.cs.unt.edu and another to http://www.htsc.unt.edu, and the latter page in turn links to the page of the National Institutes of Health, http://www.nih.gov, and also back to http://www.unt.edu. These four pages then form a subgraph of four vertices with four edges, as illustrated in Figure 5.1.
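
The four-page example can be written down as a small adjacency list. The following is a minimal sketch in Python, not code from the chapter: the names web_graph, count_edges, and in_degrees are illustrative assumptions, and a plain dictionary stands in for whatever graph representation a real crawler or search engine would use.

```python
# Directed graph for the four-page example: each page (vertex) maps to the
# list of pages it links to (outgoing edges). All names are illustrative.
web_graph = {
    "http://www.unt.edu": ["http://www.cs.unt.edu", "http://www.htsc.unt.edu"],
    "http://www.cs.unt.edu": [],
    "http://www.htsc.unt.edu": ["http://www.nih.gov", "http://www.unt.edu"],
    "http://www.nih.gov": [],
}


def count_edges(graph):
    """Return the number of directed edges (links) in the graph."""
    return sum(len(targets) for targets in graph.values())


def in_degrees(graph):
    """Return, for each page, how many pages in the graph link to it."""
    degrees = {page: 0 for page in graph}
    for targets in graph.values():
        for target in targets:
            degrees[target] = degrees.get(target, 0) + 1
    return degrees


num_vertices = len(web_graph)
num_edges = count_edges(web_graph)
incoming = in_degrees(web_graph)

print(num_vertices, num_edges)
print(incoming)
```

In this sketch every page has exactly one incoming link, while only http://www.unt.edu and http://www.htsc.unt.edu have outgoing links. Link-analysis methods such as PageRank and HITS operate on this kind of link structure.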

Although the size of the Web is generally considered to be unknown, there are various estimates concerning the size of the indexed Web – that is, the subset of the Web that is covered by search engines.

Type: Chapter
Information: Graph-based Natural Language Processing and Information Retrieval, pp. 91-105

DOI: https://doi.org/10.1017/CBO9780511976247.006

Publisher: Cambridge University Press

Print publication year: 2011
