UNSW   Faculty of Engineering PRINT VERSIONSITE MAP  
cse | School of Computer Science and Engineering (CRICOS Provider No. 00098G)
    #About CSE     #Undergraduate Study     #Postgraduate Study     #Timetables & Courses     #Research & Publications     #People & Work Units     #Help & Resources     #News & Events     #High School Portal

Last updated 03.05.09

Distributed Systems [COMP9243]

Session 1, 2009 - Assignment 2


Please read the assignment carefully! There are many suggestions contained in the specification that will make your life significantly easier. Furthermore, check back regularly, as we will put clarifications up here if there are misunderstandings or bugs in the spec.

Printer-friendly version.
 
News:
This is an individual assignment. No team work is permitted.

Introduction

Your task is to implement, in Erlang, a simple message router network whose main task is to route simple messages between the nodes that make up the network. In addition to routing messages, the router network needs to support atomic updates of its routing configuration as well as a form of checkpointing by saving snapshots of the network state to disk. These snapshots can, at a later point in time, be used to restart the network from the saved state.

Network Structure

The router network consists of a collection of router nodes that are connected to each other by labelled edges, forming a directed, labelled, strongly-connected graph (which we will call the network graph). A message routed through the network starts at one node and is passed on between other nodes over the edges until it reaches its destination node. The nodes are named and the edge labels direct message routing in the network. Specifically, an edge label consists of a set of node names that specify which nodes can be reached by sending a message over that edge. Thus, for example, a node will forward a message whose destination is A over the edge whose label contains A.

The following figure shows an example of a network graph.

Based on the edge labelling in this graph, a message starting from node red with destination green would be sent from red to white, then from white to blue, and then from blue to green.

The router network also contains a single controller node that is responsible for building and maintaining the network. This node is not part of the network graph and is not involved in message routing.

In this assignment each node is implemented as a separate Erlang process (i.e., there will be one controller process and several router processes). Communication between processes is done using Erlang's communication primitives. Direct communication between router processes is only allowed if an edge exists between the corresponding nodes in the graph. The controller process can communicate directly with each router process.

Each router process maintains a routing table that represents the information contained in the corresponding node's edge labels. Such a routing table contains an entry for every node in the graph pointing to the next intermediary node to which a message for that node must be routed. The routing table will be described in more detail later.

Network Reconfiguration

The network layout can be changed by control requests. To be precise, control requests can change the network in two ways:

  1. they can create new nodes by spawning new router processes;
  2. they can change the graph by modifying the contents of existing routing tables.

Note that whenever new router processes are created, the routing tables also change, otherwise the new nodes would never be reachable.

To avoid inconsistent states of the network, it is important that a given control request is either successfully executed by all router processes in the network, or alternatively fails to modify any process. This atomic update of the distributed routing information is to be realised by a variant of the two-phase commit protocol (2PC) discussed in the lecture. The main difference to the standard protocol is that the controller does not directly contact all router processes. Instead, it only communicates with a single router process (called the root router process for that control request) that propagates the request into the rest of the network. The root router eventually informs the controller whether the request was successful. Different control requests may use different root routers.

Any router process that is engaged in 2PC will cease to forward routed messages until it has either committed to the update or aborted the request. (This is not as efficient as the system could be, but it simplifies matters greatly.)

Snapshot

Upon request by the controller, the network must be able to compute a consistent distributed snapshot using the Chandy-Lamport snapshot protocol. The snapshot will dump the routing tables of all router processes, as well as all messages currently underway, to disk. The network, including the stored messages, can be re-created by the controller from the disk image at a later point in time. It is imperative that, during the computation of the snapshot, all router processes continue to forward routed messages. This means that, in contrast to 2PC, snapshots must not halt the routing process.

Assignment

The code for the controller and router processes must be implemented in two separate Erlang modules called control and router, respectively. The interface of these two modules is fully defined in this specification; this includes both the set of exported functions as well as the format and semantics of the mandatory messages sent and understood by these functions. It is important that you strictly adhere to this specification since the two modules will also be tested in isolation (e.g., your router code will be tested with our controller code).

This assignment does not make full use of Erlang's support for distributed processes. Distributing processes in Erlang is very easy (and therefore doesn't tell us much about your skills), but it makes testing harder. Hence, you will only be required to work with multiple Erlang processes on a single processing node.

Router Process

The main purpose of the router process is to route messages across the network. These messages do not carry a "real" payload, but instead accumulate a trace of the names of the encountered router nodes. Once a message reaches its final destination, the destination process sends the trace to a controller process identified in the message. The purpose of all this (keeping route traces and sending them to the controller) is to be able to test the routing functionality including the configuration of the routing tables.

The actual routing table maintained by each router process maps the node names of message destinations to process identifiers. Whenever a message is forwarded, it is sent to the process associated with the message's destination in the local routing table. The correspondence between a process' routing table and the network graph is as follows: Given the process identifier Pid of a current process' neighbouring process, the set of all destinations that are mapped to Pid by the routing table forms the label on the graph edge between the current process and the process identified by Pid.

As an example, the following table corresponds to the routing table that would be maintained by the process for node white in the network graph shown previously (where Pid_blue is the pid of the process for node blue and Pid_red is the pid of the process for node red).

DestPid
redPid_blue
bluePid_red
greenPid_blue

To represent the routing table, use Erlang's ets module. In particular, to save the routing table to a file, use the function ets:tab2file and, to restore the routing table from a file, use file2tab.

The configuration of a router process is fully determined by its name and routing table. However, for practical purposes the routing table of a router process also needs to store the number of incoming graph edges to the node. Store this number in the same ets as the rest of the routing table, but store it under the key '$NoInEdges' (the single quotes are needed to make this an atom in Erlang). It is crucial that you use this key name, as some autotests may depend on it.

Function Interface

The module router exports only a single function:

start (RouterName)
Spawn a new router process implementing a router node named RouterName (which must be an atom whose lexeme consists of characters, numbers, and underscores only). The return value is the same as that produced by the Erlang function spawn/1; in particular, if process creation is successful, the pid of the new router process is returned.

Control Process

The control process builds a router network by starting the individual router processes. It also exercises the network by sending messages into the network and recording the responses. Furthermore, the control node may alter the configuration of the network using control messages, triggering snapshots, and re-creating a network from a snapshot.

Note that the control process does not serve as a central authority during control operatins such as atomic configuration updates or when computing snapshots.

Function Interface

The module control exports the following three functions:

graphToNetwork (Graph)
Creates a set of router processes from a description of the network graph. The format of the graph description is as follows:
  • Graph is a list of node specifications of the form {NodeName, Edges}, where NodeName is the symbolic name of the node (to be used with router:start/1) and Edges describes the outgoing edges of that graph node.
  • Edges: the outgoing edges of a graph, are represented by a list of edge descriptions of the form {Dest, Names}, where Dest is the symbolic name of the node to which this graph edge points and Names describes the edge label.
  • Names: an edge label, is a list of node names identifying the messages that are to be routed via a particular edge (i.e., those messages having a destination node that appears in Names).

During construction, the network will not always be strongly connected. Moreover, a newly started router process must be initially configured by sending it a single control message that uses 0 as its sequence number. The control process sends this message to each node individually, it is not intended to be propagated through the network.

Return value: This function returns the pid of the router process implementing the node corresponding to the first entry in Graph.

recover (SnapName)
Re-create the router network described in SnapName and inject all routed messages listed in the snapshot into the new network. It is assumed that the snapshot was generated by a snapshot request as described in this specification. SnapName identifies the directory in which the snapshot files are contained.

During construction, the network will not always be strongly connected. Moreover, the initial configuration of a newly started router process must be by way of a single control message that uses 0 as its sequence number.

Return value: An ets table representing a finite map from node names to pids for all recovered router nodes; i.e., the table contains pairs {NodeName, NodePid} for each node in the recovered snapshot.

extendNetwork (RootPid, SeqNum, From, {NodeName, Edges})
Extend an existing router network by a single node. RootPid is the pid of the router process to which the controller will originally send this request and SeqNum is the sequence number that is to be used in the control request. The network extension is to be performed by the router node named From (this is a node name, not a pid), which creates a new node named NodeName whose routing table is described by Edges. The format of Edges follows that of the node entries of the graph description used in graphToNetwork with the exception that edge endpoints are identified by pid, not by node name.

After the network has been successfully extended, the routing tables of the individual nodes in the resulting network must have changed as follows:

  • Node From: The routing table has a single additional entry. It is for NodeName and refers to the newly created process.
  • Node NodeName: The routing table is configured as specified by Edges.
  • All other nodes: The routing table has a single new entry. It is for NodeName and refers to the same process as the entry into the table indexed by From (indicating that all messages to the new node NodeName are routed via From).

Immediately after creation of the new process, the network will not be strongly connected. Moreover, the initial configuration of the newly started router process must be by way of a single control message that uses 0 as its sequence number.

Return value: This function returns true or false indicating whether the extension was successful or not. The operation is not successful exactly if the control request implementing the network extension aborts.

Messages

The main message loop of a router node must understand the following messages. In addition, router nodes need to use other messages during 2PC and snapshots that you may define as you see fit.

{control, From, Pid, SeqNum, ControlFun}
Control message: These messages are used to propagate control requests throughout the network. The argument From contains the pid of the process that sent this message, Pid identifies the controller process that initiated the control request, SeqNum is a unique sequence number for this request, and ControlFun is a functional argument that implements the re-configuration operation specified by the control request. The ControlFun is to be applied as follows:
Children = ControlFun (Name, Table)
In this example, Name is the name of the router node where the function is applied (the same name as passed as an argument to the function router:start/1). The argument Table is the routing table of that node. The result, called Children above, equals the atom abort if the configuration function failed and the node will not be able to commit to this control request; otherwise, Children is a list of the pids of any processes spawned by ControlFun, which need to be killed if the 2PC for this control request fails at a later point. The controller will only send a single control message to some router in the network, called the root router process for this request. The root router, which is identified by receiving a control message where From == Pid will eventually reply to Pid with {committed, self (), SeqNum} or {abort, self (), SeqNum} depending on whether the control request succeeded in re-configuring the network or not.

A process engaged in a 2PC must abort if it does not receive a 2PC-related message for 5 seconds. (This holds only for the first phase.)

{message, Dest, From, Pid, Trace}
Routed message: A message that is to be routed to the node called Dest (i.e., to the router process started as router:start (Dest)). The argument Trace contains a list of the names of router nodes through which this message already passed, in reverse order. When the message finally reaches the destination router, this router must send the complete trace (in traversal order) to the process identified by Pid. The argument From is the pid of the process that sent this message. As an example, consider that a message {message, green, Pid, Pid, []} is send to the router node called red. Suppose that red forwards the message to yellow, which sends it to green (i.e., the destination node). Then, green must send {trace, GreenPid, [red, yellow, green]} to Pid, where GreenPid is the pid of the router process green.
{dump, From}
Routing table dump request: This message must be answered by sending a message of the form {table, self (), Dump} to From, where Dump is the structure obtained by evaluating the expression ets:match (Table, '$1'), assuming Table is the routing table of the node.
{snapshot, From, Pid, SnapName}
Snapshot request: This message is used to implement the Chandy-Lamport protocol for distributed snapshots. The argument From identifies the sender of the message and Pid indicates the process that initiated the snapshot. Finally, SnapName is a string that identifies the snapshot and simultaneously serves as the pathname of the directory in which the snapshot files must be stored. For testing and debugging purposes we also require that whenever a router process receives the first such message during the execution of a snapshot, it must forward this message as {snapshot, self (), Pid, SnapName} to the process Pid. Moreover, as soon as a router has received snapshot messages on all incoming edges (i.e., it has collected all messages that must be saved as part of the snapshot), it must send a {snapshotSaved, self (), SnapName} message to Pid. When a new snapshot is initiated by the controller, for the initial snapshot message sent to a router process, we have From == Pid. The controller may send this initial message to any router in the network and will not send any further messages concerning this snapshot to router processes. It is the responsibility of the router process that receives this initial snapshot message to eventually reply to Pid with either {snapshotComplete, self (), SnapName} or {abort, self (), SnapName}, depending on whether the snapshot succeeded or not.

A process that receives a snapshot request with a snapshot name SnapName must generate two files within the directory SnapName (snapshot names are guaranteed to be valid directory names). The names of the files are <node-name>.<node-pid>.table and <node-name>.<node-pid>.msgs, where <node-name> is the name of the node (as passed to the function router:start/1) and <node-pid> is the pretty-printed pid (using pid_to_list) of the process implementing that same router node. The file <node-name>.<node-pid>.table must contain the routing table in the format generated by ets:tab2file and the file <node-name>.<node-pid>.msgs must contain the messages collected by <node-name> during that snapshot in the format of a list of pairs {Dest, Trace}, where Dest is the destination of the message and Trace the trace accumulated so far (in reverse order). This list of pairs must be represented in the format generated by the Erlang function io:write/2.

Make sure that the list of pairs stored in the .msgs files is followed by a dot ("."). This allows these files to be easily read using file:consult/1.

stop
Terminate router: The router process receiving this message must terminate after de-allocating its routing table.

Message Flow

It is important to note that routed messages, on one hand, and control messages, snapshot requests, and other messages needed for 2PC and snapshots, on the other hand, flow through the network in entirely different ways.

  • Routed messages: Routed messages may only flow along edges in the network graph whose label contains the name of their destination node; i.e., the routing tables need to be used to route these messages.
  • Control messages and snapshot requests: These two kinds of messages flow along graph edges. Any router process that receives such a message forwards it to all processes that are mentioned in its routing table (i.e., it propagates the messages along all outgoing graph edges). Since the graph is strongly connected, any such message will eventually reach every node in the graph. The nice thing about this behaviour is that whoever sends the initial control message or snapshot request does not need to know all the nodes contained in the network.
  • Other messages needed for 2PC and snapshots: Some messages, such as the committed or abort messages generated after a control request has been executed, flow through the network in the opposite direction as control messages and snapshot requests. These messages ignore the routing table entirely.

Why this set up? Routed messages are the data in the network. The central purpose of the routing table is to route that data. Control and snapshot messages flow along graph edges so that no global table containing the pids of all router processes has to exist. This is good distributed systems design: avoid central resources. Messages that are effectively answers to control and snapshot messages need to flow opposite to how control and snapshot messages flow; control and snapshot messages contain the From argument to allow for such direct replies.

Implementation Hints and Further Instructions

Further Instructions

  • Consider whether it is possible that, after a 2PC or snapshot aborts, there may be messages in the network that will never be consumed so that repeated occurences of such events slowly fill up the mailbox of individual processes. Implement measures to prevent that garbage from accumulating in mailboxes.
  • During the construction of a router network (or parts thereof), such as performed by the functions graphToNetwork/1, recover/1, and extendNetwork/4, the network will go through intermediate states, where it is not strongly connected (e.g., immediately after a router process is started, it is not connected at all). After a router process has been started, it must receive an initial routing table before it can participate in any further operations. By convention, the control message for this initial configuration always has the sequence number 0 (this convention merely exists to simplify making all sequence numbers distinct). However, after any of graphToNetwork/1, recover/1, and extendNetwork/4 completes, the network must be strongly connected, which in the case of graphToNetwork/1 and extendNetwork/4 requires that correct arguments are passed to the functions.
  • Snapshot requests and the 2PC of control requests are mutually exclusive. That is, if a node participating in a snapshot computation receives a control request, it aborts that control request; conversely, when a node that is participating in a 2PC receives a snapshot request, it aborts that snapshot request. This is the only reason for which snapshots may be aborted.

Implementation Hints

The following are merely hints to help you. You are not required to proceed as indicated here if you prefer an alternative implementation.

  • The module lists contains a number of useful functions that may save you some coding.
  • The Erlang modules io and file contain generally useful I/O routines.
  • Erlang exceptions can make the code significantly more elegant.
  • Look at the FAQ slides (PDF) for some helpful information and examples.

Testing Code

Here follows some simple testing code to provide concrete examples of the use of the router network. However, please note that this code is not sufficient to completely test a solution to this assignment. You will need to develop additional tests.

A Simple Network Graph

The following describes a simple network graph in the format expected by graphToNetwork/1. It describes the graph shown earlier in this specification:

simpleNetworkGraph () ->
  [{red  , [{white, [white, green]},
	    {blue , [blue]}]},
   {white, [{red, [blue]},
	    {blue, [green, red]}]},
   {blue , [{green, [white, green, red]}]},
   {green, [{red, [red, blue, white]}]}
  ].

A Simple Network Test

The Erlang module networkTest implements network validation against a graph specification of the network by exhaustive path coverage.

A Network Reconfiguration Test

The Erlang module controlTest demonstrates the use of a control request by reversing the edge direction of a small (three node) cyclic graph. This module needs networkTest to operate.

A Network Extension Test

The Erlang module extendTest demonstrates the use of the function control:extendNetwork/4. This module needs networkTest to operate.

A Snapshot & Recovery Test

The Erlang module snapshotTest demonstrates the use of snapshot requests and the function control:recover/1. This module needs networkTest to operate. The function snapshotTest:runTest/0 uses a hardcoded directory called snap1/ to save the snapshot. You must make sure that this directory is empty before executing the test. Warning: If the test reports lost messages, that does not necessarily imply a problem with your code; i.e., this test is overly strict.

Deadline & Submission Procedure

The submission deadline is Sunday, 24 May 2009 (23:58). Please follow the submission guidelines (otherwise, you may lose marks). In addition to your code, you have to submit a design document in a text file (pure ASCII text, nothing else) called DESIGN. This file must outline the design of your implementation and highlight any special features or shortcomings. You will receive style marks for this document. In particular, the following points must be addressed in the design document:

  • Provide an exact description of the algorithm and protocol (including the format of the various messages and their semantics) used to implement two-phase commit.
  • Provide an exact description of the algorithm and protocol (including the format of the various messages and their semantics) used to implement consistent snapshots.


COMP9243, School of Computer Science and Engineering, University of New South Wales CRICOS Provider Number: 00098G

This page is maintained by cs9243@cse.unsw.edu.au Last modified: Sunday, 03-May-2009 22:11:00 EST

Top Of Page

 ###
Site maintained by webmistress@cse.unsw.edu.au
Please read the UNSW Copyright & Disclaimer Statement