Explanations for Part-2

By Zain Afzal



What on earth is this math, I'm not a math person

Math, more like no

Zain

Yeah that's fair, it was confusing to me too, so lets break down each centrality.

Closeness

This for a graph with no isolated components is simply the sum of the shortests paths from one node to every other node. For graphs with isolated components (like the ones given in this assignment) it's the same idea but we scale the values we get at the end, but lets go through the simple idea first. Consider the example below.

We need to know the shortest distance from each node to every other node, so for the sake of going through this on paper lets make a table (this doesn't mean your code has to use a 2D matrix, i'm just going through the algorithm)

Note that i don't care about the distance from a node to itself (since that's meaningless, it's like saying how far is my house while in your house)

Anyway the table looks like this

0 1 2
0 X
1 X
2 X

So we know that the shortest way to get to 2 from 0 is to just relax the edge from 0 to 2 with weight 1

Hence the distance of 2 from 0 is 1

0 1 2
0 X 1
1 X
2 X

We repeat this with all other nodes to get

0 1 2
0 X 2 1
1 2 X 3
2 3 1 X

Now we sum up, the closeness of node 0 is 2+1 = 3 for node 1 it's 2 + 3 = 5 and for node 2 it's 3+1 = 4

Finally we normalise, by taking each of these sums and dividing under nV-1

for us we have 3 nodes so Nv-1 is 2

so our values become

closeness centrality
0 2/3 = 0.666667 (rounded)
1 2/5 = 0.4
2 2/4 = 0.5
Note that this works well for simple test cases with no isolated components but with graphs where there isn't a path from every single node to every single other nodes you need to use the eqation provided in the specs.

Betweenness

Betweenness is a measure of how important a node is in a graph, i.e if we took it away how many nodes would become disconnected.

We can measure this by counting how many shortest paths pass through a certain node.

In the same graph from above we can build a table if every pair and the shortest paths between the two (if there are two or more paths with the same distance, all minimal, we take them all)

pair paths
0 - 0 [0]
0 - 1 [0, 1],[0, 2, 1]
0 - 2 [0, 2]
1 - 0 [1, 0]
1 - 1 [1]
1 - 2 [1, 0, 2]
2 - 0 [2, 1, 0]
2 - 1 [2, 1]
2 - 2 [2]

now ignore the start and end nodes and count how many times each node appears. i.e if we have [0,1,2] we don't add 1 to the counts of 0 or 2, just 1 because 1 is the only node that is "between"

Appearences
0 1
1 1
2 1

Now we have to do a extra step. If we count a node that appears in 1 of many shortest paths, such as the 2 in the path from 0 to 1. We have to scale it's value down. Since it's only 1 of many shortests paths the node isn't that important cause if we killed it the other shortest paths would still exist.

So for each pair we take the number of times a certain node appears in the path and divide by the total number of paths.

so for the case of 2 we get 1/2. there are 2 shortests paths and 2 appears in 1 of them.

Appearences
0 1
1 1
2 0.5