Week 07


Graph Traversal


Graph Traversal2/49

A common class of graph algorithms involves

We often talk about graph search rather than graph traversal

Two strategies for graph traversal/search:   depth-first,   breadth-first


... Graph Traversal3/49

Previous expression of algorithms used

The above makes assumption about graph representation (adj.matrix)

To make things more representation-independent, use


Depth-first Traversal4/49

Traverse graph (depth-first, recording visiting order):

int order;     // visiting order
int *visited;  // array of visiting orders
               // indexed by vertex 0..V-1
void dfs(Graph g)
{
   int i;
   visited = calloc(nV(g),sizeof(int));
   order = 1;
   dfsR(g, 0);
}
void dfsR(Graph g, Vertex v)
{
   visited[v] = order++;
   Vertex w;
   for (w = 0; w < nV(g); w++) {      // for each
      if (!hasEdge(g,v,w)) continue;  // neighbour
      if (!visited[w]) dfsR(g, w);
   }
}


... Depth-first Traversal5/49

Visiting order for ...

[Diagram:Pics/graphs/traversalDFS-small.png]


Exercise 1: Depth-first Traversal6/49

Which vertices will be visited during dfs(g):

[Diagram:Pics/graphs/traversal2-small.png]

How can we ensure that all vertices are visited?


... Depth-first Traversal7/49

Modified dfs to handle non-connected subgraphs:

void dfs(Graph g)
{
   int i;
   visited = calloc(nV(g),sizeof(int));
   order = 1;
   while (order < nV(g)) {
      Vertex v;
      for (v = 0; v < nV(g); v++)
         if (visited[v] == -1) break;
      dfsR(g, v);
   }
}


DFS Examples8/49

Some problems to solve via DFS graph search

[Diagram:Pics/graphs/example0-small.png]


Checking for Cycles9/49

A graph has a cycle if

Function  bool hasCycle(Graph g)  tells us this.

We are not required to give the path, just indicate its presence.

[Diagram:Pics/graphs/cycle-ex-small.png]


Exercise 2: Buggy Cycle Check10/49

This DFS cycle check has a bug. Fix it.

int *visited;  // array [0..V-1] of booleans

int hasCycle(Graph g)
{
   visited = calloc(nV(g),sizeof(int));
   int result = dfsCycleCheck(g, 0);
   free(visited);
   return result;
}
int dfsCycleCheck(Graph g, Vertex v)
{
   visited[v] = 1;
   Vertex w;
   for (w = 0; w < nV(g); w++) {
      if (!hasEdge(g,v,w)) continue;
      if (visited[w]) return 1; // found cycle
      return dfsCycleCheck(g, w);
   }
   return 0; // no cycle at v
}


Connected Components11/49

Each vertex belongs to a connected component

Function components(Graph g) sets up componentOf[]

[Diagram:Pics/graphs/components-small.png]


... Connected Components12/49

Assign vertices to connected components:

int ncounted;      // # vertices allocated
int *componentOf;  // array of component ids
                   // indexed by vertex 0..V-1
void components(Graph g)
{
   int i, componentID = 1; 
   componentOf = calloc(nV(g),sizeof(int));
   ncounted = 0;
   while (ncounted < nV(g)) {
      Vertex v;
      for (v = 0; v < nV(g); v++)
         if (componentOf[v] == -1) break;
      dfsR(g, v, componentID);
      componentID++;
   }
   // componentOf[0..nV-1] is now set
}
void dfsR(Graph g, Vertex v, int c)
{
   componentOf[v] = c;
   ncounted++;
   Vertex w;
   for (w = 0; w < nV(g); w++) {
      if (!hasEdge(g,v,w)) continue;
      if (componentOf[w] == 0) dfsR(g,w,c);
   }
}


Breadth-first Search13/49

BFS algorithm (records visiting order):

int *visited; // array [0..V-1] of visiting order

void bfs(Graph g, Vertex v)
{
   int i, order = 1;
   visited = calloc(nV(g),sizeof(int));
   Queue q = newQueue();
   QueueJoin(q,v);
   while (!QueueIsEmpty(q)) {
      Vertex y, x = QueueLeave(q);
      if (visited[x]) continue;
      visited[x] = order++;
      for (y = 0; y < nV(g); y++) {
         if (!hasEdge(g,x,y)) continue;
         if (!visited[y]) QueueJoin(q,y);
      }
   }
}


Exercise 3: Breadth-first Traversal14/49

Show the BFS order we visit vertices in bfs(g,0):

[Diagram:Pics/graphs/traversal-small.png]


Exercise 4: BFS Visiting Order15/49

Change the above algorithm so that visited

I.e. if we visited nodes in the order 0-3-1-4-2


Paths: Simple, Hamilton, Euler


Simple Paths17/49

A simple path is

If v = w, then the path is a cycle.

Questions on paths:


Path Finding18/49

BFS algorithm for path checking:

int hasPath(Graph g, Vertex src, Vertex dest)
{
   int *visited = calloc(nV(g),sizeof(int));
   Queue q = newQueue();
   QueueJoin(q,src);
   int isFound = 0;
   while (!QueueIsEmpty(q) && !isFound) {
      Vertex y, x = QueueLeave(q);
      for (y = 0; y < nV(g); y++) {
         if (!hasEdge(g,x,y)) continue;
         if (y == dest) { isFound = 1; break; }
         if (!visited[y])
            { QueueJoin(q,y); visited[y] = 1; }
      }
   }
   free(visited);
   return isFound;
}


... Path Finding19/49

BFS algorithm for finding shortest path:

int findPath(Graph g, Vertex src, Vertex dest)
{
   Vertex *path = malloc(nV(g) * sizeof(Vertex));
   int *visited = calloc(nV(g),sizeof(int));
   Queue q = newQueue();
   QueueJoin(q,src);
   int isFound = 0;
   while (!emptyQ(q) && !isFound) {
      Vertex y, x = QueueLeave(q);
      for (y = 0; y < nV(g); y++) {
         if (!hasEdge(g,x,y)) continue;

	 if (!visited[y])
            { path[y] = x; QueueJoin(q,y); visited[y] = 1; }

         if (y == dest) { isFound = 1; break; }
         
      }
   }
   if (isFound) {
      // display path in dest..src order
      Vertex v;
      for (v = dest; v != src; v = path[v])
          printf("%d-", v);
      printf("%d\n", src);
   }
   free(visited); free(path);
   return isFound;
}


... Path Finding20/49

The above algorithm finds a "shortest" path

In many applications, edges are weighted and we want We discuss weighted/directed graphs later.


Hamilton Path and Tour21/49

Hamilton path problem:

If v = w, then we have a Hamilton Tour.

Simple to state, but difficult to solve (NP-hard).

Example graph and two possible Hamilton paths:

[Diagram:Pics/graphs/hamilton1-small.png]


... Hamilton Path and Tour22/49

Approach (generate-and-test):

Can be expressed via a recursive DFS algorithm


... Hamilton Path and Tour23/49

Function to find Hamilton path:

Vertex *visited; // array [0..V-1] of bools

int hasHamiltonPath(Graph g, Vertex src, Vertex dest)
{
   visited = calloc(nV(g),sizeof(int));
   int res = HamiltonR(g, src, dest, nV(g)-1);
   free(visited);
   return res;
}

int HamiltonR(Graph g, Vertex v, Vertex w, int d)
{
   int t; 
   if (v == w) return (d == 0) ? 1 : 0;
   visited[v] = 1;
   for (t = 0; t < nV(g); t++) {
      if (!hasEdge(g,v,t)) continue;
      if (visited[t] == 1) continue;
      if (HamiltonR(g,t,w,d-1)) return 1;
   }
   visited[v] = 0;
   return 0;
}


... Hamilton Path and Tour24/49

Analysis: worst case requires (V-1)! paths to be examined

Consider a graph with isolated vertex and the rest fully-connected

[Diagram:Pics/graphs/hamilton-worst-small.png]

Checking hasHamiltonPath(g,0,x) for any x

There is no simpler algorithm for this task ⇒ NP-hard.

Note, however, that the above case could be solved in constant time if
we had a fast check for 0 and x being in the same connected component


Hamilton Path and Tour25/49

Hamilton path problem:

If v = w, then we have a Hamilton Tour.

[Diagram:Pics/graphs/hamilton1-small.png]

No efficient algorithm known (use generate-and-test)


Euler Path and Tour26/49

Euler path problem:

Note: the path does not have to be simple ⇒ can visit vertices more than once)

If v = w, the we have a Euler tour.

[Diagram:Pics/graphs/euler-path-tour-small.png]


... Euler Path and Tour27/49

One possible "brute-force" approach:

Can develop a much better algorithm by exploiting:


... Euler Path and Tour28/49

Assume the existence of degree(G,v) (degree of a vertex)

Function to check whether a graph has a Euler path:

int hasEulerPath(Graph g, Vertex v, Vertex w)
{
   int t = degree(g,v) + degree(g,w);
   if ((t % 2) != 0) return 0;
   Vertex x;
   for (x = 0; x < nV(g); x++) {
      if (x != v && x != w) {
         if ((degree(g,x) % 2) != 0)
            return 0;
      }
   }
   return 1;
}

Analysis:


Exercise 5: Vertex Degrees29/49

  1. Write the degree(g,v) function (assume adj.matrix representation)
  2. Write a function that ...

    Use the function interface:

    void degrees(Graph g, int ds[]) { ... }
    

  3. Use degrees to re-implement hasEulerPath()


Connected Components30/49

Problems:

Both of the above can be solved if we can Function components(g) above does this for us.


Exercise 6: Connectedness Checks31/49

Assume that components(g) has generated componentOf[nV]

  1. Implement sameComponent(g,v,w) to check if v and w are in same subgraph
  2. Implement nComponents(g) to count the number of subgraphs

[Diagram:Pics/graphs/components-ex-small.png]


... Connected Components32/49

Consider an application where connectivity is critical

Add a new fields to the GraphRep structure:

struct GraphRep {
   ...
   int nC;  // # connected components
   int *cc; // which component contains each vertex
   ...      // i.e. array [0..nV-1] of 0..nC-1
}


... Connected Components33/49

With this structure, the above tasks become trivial:

// How many connected subgraphs are there?
int nComponents(Graph g)
{
   return g->nC;
}
// Are two vertices in the same connected subgraph?
int sameComponent(Graph g, Vertex v, Vertex w)
{
   return (g->cc[v] == g->cc[w]);
}


... Connected Components34/49

Consider maintenance of such a graph representation:

Additional maintenance cost amortised by much reduced cost for nComponents() and sameComponent()


Exercise 7: Implementing Component Array35/49

Add the nC and cc fields to a GraphRep

Modify the following functions to maintain them:

Graph newGraph(int nV) { ... }
void  insertE(Graph g, Edge e) { ... }
void  removeE(Graph g, Edge e) { ... }

Consider adding the dotted edges to the following graph

[Diagram:Pics/graphs/component-maint-small.png]


Directed Graphs (Digraphs)


Directed Graphs (Digraphs)37/49

In our previous discussion of graphs:

In most real-world applications of graphs:


... Directed Graphs (Digraphs)38/49

Example digraph and adjacency matrix representation:

[Diagram:Pics/graphs/digraph1-small.png]

Non-directional symmetric matrix
Directional non-symmetric matrix


... Directed Graphs (Digraphs)39/49

Terminology for digraphs ...

Directed path:   list of n ≥ 2 vertices v1, v2, ... vn

Degree of vertex:   d(v) = number of edges like (v, _)

Indegree of vertex: d-1(v) = number of edges like (_, v)

Reachability:   w is reachable from v if directed path v,...,w

Strong connectivity:   every vertex is reachable from every other vertex

Directed acyclic graph (DAG):   graph containing no directed cycles


Digraph Applications40/49

Potential application areas:

Domain Vertex Edge
Web web page hyperlink
scheduling task precedence
chess board position legal move
science journal article citation
dynamic data malloc'd object pointer
programs function function call
make file dependency


... Digraph Applications41/49

Problems to solve on digraphs:


Digraph Representation42/49

Similar set of choices as for non-directional graphs:

[Diagram:Pics/graphs/digraph-reps-small.png]


... Digraph Representation43/49

Costs of representations:    (where degree d(v) = #edges leaving v)

Representation  Space  Add edge  Exists
edge (v,w)?
  Get edges
leaving v
Adjacency matrix  V2  1  1  V 
Adjacency lists  V+E  d(v)  d(v)  d(v) 
Adjacency sets  V+E  d(v)  log(d(v))  d(v) 

Overall, adjacency sets representation (using sorted arrays) is best


PageRank44/49

Goal: determine which Web pages are "important"

Approach: ignore page contents; focus on hyperlinks

Problem: the Web is a very large directed graph Assume for the moment that we could build a graph ...

Most frequent operation in algorithm "Does edge (v,w) exist?"


Exercise 8: Estimate the size of the Web45/49

How could you work out ...

Google indexes ≅25×109, Bing indexes ≅14×109, ...


What makes it difficult to work out?


One attempt: http://www.worldwidewebsize.com/


... PageRank46/49

Simple PageRank algorithm:

PageRank(myPage):
   rank = 0;
   foreach (page in the Web)
      if (linkExists(page,myPage)) rank++;

Note: requires inbound link check (not outbound as assumed above)

For analysis:


Exercise 9: Checking inbound links47/49

Write a function

Vertex *inBound(Graph g, Vertex v, int *nInBound)

that returns a count and array of vertices w s.t. (w,v)

Do this for


... PageRank48/49

Costs for computing PageRank for each representation:

Representation  linkExists(v,w)  Cost
Adj. matrix  edge[v][w]  V.1
Adj. lists  search(w,list[v])  V.(E/V) = E
Adj. sets  isElem(w,set[v])  V.log(E/V)

Not feasible ... V4×1010 ⇒ matrix has 1021 cells

I.e. we can't store the entire web as a Graph structure

So how to really do it?


... PageRank49/49

Approach: the random web surfer

curr = random page
for (a long time) {
   if (!isElem(seen, (prev,curr)))
      if (!inArray(rank,curr)) rank[curr] = 0
      rank[curr]++
      seen = include(seen, (prev,curr))
   if (random(0,100) < 85)
      prev = curr
      curr = choose hyperlink from curr
   else
      // avoids getting stuck
      prev = null
      curr = random page
}

Could be accomplished while we crawl web to build search index


Produced: 6 Sep 2017