Exercises: Starting Remote Processes


Remote Process Starting Tool

Introduction

In distributed systems, processes often have to be started on many remote nodes before they can work together over the network. In this exercise you will implement a tool that lets a user automatically start a specific process on a given set of remote nodes.

The tool will have to accept a number of arguments, such as the name of the executable that has to be started remotely, the list of nodes where processes must be started, and a list of arguments to be passed to the newly started processes.

This exercise is not marked and not compulsory, but the code that you develop can be used as a part of the first marked assignment. We therefore highly encourage you to do this exercise before assignment 1 starts.

rstart

The program that starts the node processes is called rstart. It is started with command line parameters that specify an executable to run, the number of instances to start, and a set of hosts on which to start those instances. It launches the specified number of program instances on the specified set of hosts. The following brief usage specification of rstart shows exactly how the program must behave.

Usage: rstart [OPTION]... EXECUTABLE-FILE NODE-OPTION...

  -H HOSTFILE list of host names
  -h          this usage message
  -n N        fork N node processes
  -v          print version information


Forks N copies (one copy if -n not given) of
EXECUTABLE-FILE.  The NODE-OPTIONs are passed as arguments to the node
processes.  The hosts on which node processes are started are given in
HOSTFILE, which defaults to `hosts'.  If the file does not exist,
`localhost' is used.
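The usage text above can be turned into a small hand-written option parser. The following is only a sketch under stated assumptions: the struct rstart_opts and the function parse_args are hypothetical names, not part of the exercise. It also assumes that every option and its argument are separate words (-n 4, not -n4), and that option parsing stops at EXECUTABLE-FILE so that NODE-OPTIONs beginning with a dash are passed through untouched.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the parsed command line. */
struct rstart_opts {
    const char *hostfile;    /* -H HOSTFILE, defaults to "hosts" */
    long nprocs;             /* -n N, defaults to 1 */
    int show_help;           /* set if -h was given */
    int show_version;        /* set if -v was given */
    const char *executable;  /* EXECUTABLE-FILE */
    char **node_argv;        /* points at the first NODE-OPTION in argv */
    int node_argc;           /* number of NODE-OPTIONs */
};

/* Parses argv according to the usage text.
   Returns 0 on success and -1 on a usage error. */
int parse_args(int argc, char **argv, struct rstart_opts *o)
{
    int i = 1;

    assert(o != NULL);
    o->hostfile = "hosts";
    o->nprocs = 1;
    o->show_help = 0;
    o->show_version = 0;
    o->executable = NULL;
    o->node_argv = NULL;
    o->node_argc = 0;

    /* Options come first; stop at the first word that is not an option. */
    while (i < argc && argv[i][0] == '-' && argv[i][1] != '\0') {
        const char *opt = argv[i++];

        if (strcmp(opt, "-h") == 0) {
            o->show_help = 1;
        } else if (strcmp(opt, "-v") == 0) {
            o->show_version = 1;
        } else if (strcmp(opt, "-H") == 0) {
            if (i >= argc)
                return -1;              /* -H without a file name */
            o->hostfile = argv[i++];
        } else if (strcmp(opt, "-n") == 0) {
            char *end;

            if (i >= argc)
                return -1;              /* -n without a number */
            o->nprocs = strtol(argv[i], &end, 10);
            if (end == argv[i] || *end != '\0' || o->nprocs < 1)
                return -1;              /* not a positive integer */
            i++;
        } else {
            return -1;                  /* unknown option */
        }
    }

    /* -h and -v do not need an executable. */
    if (o->show_help || o->show_version)
        return 0;

    if (i >= argc)
        return -1;                      /* EXECUTABLE-FILE is missing */
    o->executable = argv[i++];
    o->node_argv = argv + i;
    o->node_argc = argc - i;
    return 0;
}
```

A main function would call parse_args(argc, argv, &opts) and print the usage message when it returns -1 or when show_help is set. If POSIX is acceptable on your system, getopt from unistd.h is a common alternative to a hand-written loop.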

In this specification EXECUTABLE-FILE is the program to start on the remote nodes. The NODE-OPTION arguments are the parameters that are passed to the program. The rstart program starts N node processes and assigns each node process an identifier between 0 and N-1. The HOSTFILE argument names a text file that specifies the nodes on which the node processes will be started. Each line of this file contains a single hostname. The first node process is started on the first named host, the second process on the second named host, and so on. If there are more node processes than host names, host assignment wraps around to the first host name after a node process has been started on the last host name in the list. If the -H option is not given, the name of the hosts file defaults to hosts. If the file does not exist or is empty, localhost is used as the only hostname.
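The host assignment described above amounts to reading the hosts file into a list and taking the node identifier modulo the number of hosts. The sketch below illustrates this; the names load_hosts and host_for_node and the limits MAX_HOSTS and MAX_HOSTNAME are assumptions made for the example, not requirements of the exercise. It also assumes that blank lines can be skipped and that a host name contains no whitespace.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_HOSTS    256   /* assumed upper bound on host names */
#define MAX_HOSTNAME 256   /* assumed upper bound on name length, including '\0' */

/* Reads one host name per line from the file `path` into `hosts`.
   If the file cannot be opened or yields no names, the single
   entry "localhost" is used instead.
   Returns the number of hosts stored, which is always at least 1. */
int load_hosts(const char *path, char hosts[][MAX_HOSTNAME], int max_hosts)
{
    FILE *fp = fopen(path, "r");
    char line[MAX_HOSTNAME];
    int n = 0;

    assert(max_hosts >= 1);
    if (fp != NULL) {
        while (n < max_hosts && fgets(line, sizeof line, fp) != NULL) {
            /* Cut the line at the first whitespace character,
               which also removes the trailing newline. */
            line[strcspn(line, " \t\r\n")] = '\0';
            if (line[0] != '\0')
                strcpy(hosts[n++], line);
        }
        fclose(fp);
    }
    if (n == 0) {
        strcpy(hosts[0], "localhost");
        n = 1;
    }
    return n;
}

/* Round-robin assignment: node 0 runs on host 0, node 1 on host 1,
   and after the last host the assignment wraps around to host 0. */
int host_for_node(int node_id, int nhosts)
{
    return node_id % nhosts;
}
```

With three hosts and five node processes, nodes 0, 1 and 2 are placed on the first, second and third host, and nodes 3 and 4 wrap around to the first and second host again.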

The parameter EXECUTABLE-FILE will usually be an absolute file name, and must refer to the exact same executable on all hosts listed in the hosts file. It is up to the user of rstart to ensure that this condition is met. On the vina cluster, NFS ensures that all home directories are uniformly available on all nodes.

Environment

You can develop and test your code on our vina cluster (hostnames vina00 to vina09). The exercise should be coded in ANSI C. You can make use of ssh to start remote processes.
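One possible way to use ssh is to fork a child process for every node process and let the child execute ssh with the host name, the executable and the NODE-OPTIONs as its arguments. The sketch below assumes a POSIX system (fork and execvp are not part of ANSI C itself) and uses the hypothetical helper names build_ssh_argv and spawn. How the node identifier reaches the started process is not specified in this exercise, so the sketch leaves it out.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Builds the NULL-terminated argument vector
       ssh HOST EXECUTABLE NODE-OPTION...
   The array is allocated with malloc and must be freed by the caller.
   The strings themselves are not copied.
   Returns NULL if the allocation fails. */
char **build_ssh_argv(const char *host, const char *executable,
                      char **node_argv, int node_argc)
{
    char **av;
    int i, k = 0;

    assert(host != NULL && executable != NULL && node_argc >= 0);
    /* "ssh", host, executable, the node options, and the final NULL. */
    av = malloc((size_t)(node_argc + 4) * sizeof *av);
    if (av == NULL)
        return NULL;
    av[k++] = (char *)"ssh";
    av[k++] = (char *)host;
    av[k++] = (char *)executable;
    for (i = 0; i < node_argc; i++)
        av[k++] = node_argv[i];
    av[k] = NULL;
    return av;
}

/* Forks a child that replaces itself with the command in `av`.
   Returns the child's process id in the parent, or -1 if fork failed. */
pid_t spawn(char **av)
{
    pid_t pid = fork();

    if (pid == 0) {
        execvp(av[0], av);
        /* Only reached if execvp failed. */
        perror("execvp");
        _exit(127);
    }
    return pid;
}
```

The parent would call spawn once per node process and later collect the children with waitpid. Note that ssh joins the command words with spaces and hands the result to a shell on the remote host, so NODE-OPTIONs that contain spaces or shell metacharacters would need extra quoting.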

This page is maintained by cs9243@cse.unsw.edu.au. Last modified: Monday, 07-Sep-2020 22:07:44 AEST