[prev] 44 [next]

PageRank

Goal: determine which Web pages are "important"

Approach: ignore page contents; focus on hyperlinks

  • treat Web as graph: page = vertex, hyperlink = di-edge
  • pages with many incoming hyperlinks are important
  • need to computing "incoming degree" for vertices
Problem: the Web is a very large directed graph
  • approx. 1011 pages,   1013 hyperlinks
Assume for the moment that we could build a graph ...

Most frequent operation in algorithm "Does edge (v,w) exist?"