[prev] 86 [next]

PageRank

Goal: determine which Web pages are "important"

Approach: ignore page contents; focus on hyperlinks

  • treat Web as graph: page = vertex, hyperlink = directed edge
  • pages with many incoming hyperlinks are important
  • need to compute "incoming degree" for vertices
Problem: the Web is a very large graph
  • approx. 1014 pages,   1015 hyperlinks
Assume for the moment that we could build a graph …

Most frequent operation in algorithm "Does edge (v,w) exist?"