Advanced Operating Systems
COMP9242 2014/S2 
UNSW


M5: File System

Milestone overview

In this milestone, you will:

  • Implement support for open, close, read, write, getdirent and stat in SOS.
    • File protection (read/write permissions) should be respected.
    • Non-existing files should be created with read+write permissions by default.
  • Benchmark your file system to show read and write performance.

The SOS file system features a flat directory structure, containing files and the terminal device console. Files have attributes (such as size, modification time and protection) that can be accessed through appropriate system calls. All I/O primitives provided by SOS are synchronous (blocking).

You will probably want to maintain file descriptors that record the current read position in each open file. This requires a minimal Process Control Block (PCB) data structure. Again, you will need to modify this structure as you go, so do not waste too much time on the full details right now. You may wish to initially deal with only a single client process, whose data is kept in a single PCB variable. This avoids dealing with process IDs for the time being.
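
As a rough illustration, a per-process file table might look something like the sketch below. All of the names here (struct open_file, struct pcb, MAX_OPEN_FILES) are hypothetical and not part of the provided SOS code.

    #include <stdbool.h>
    #include <stddef.h>

    #define MAX_OPEN_FILES 16

    struct open_file {
        bool    in_use;   /* is this slot currently allocated? */
        int     mode;     /* read/write mode requested at open time */
        size_t  offset;   /* current read/write position in the file */
        void   *handle;   /* underlying file object, e.g. NFS handle or console */
    };

    struct pcb {
        struct open_file fd_table[MAX_OPEN_FILES];  /* per-process fd table */
        /* ... extended with address space, TCB cap, etc. in later milestones */
    };

    /* A single client process for now, as suggested above. */
    static struct pcb the_process;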

The low-level file system is implemented by the NFS file system on your host machine. You are provided with an NFS client library for accessing files on your host Linux machine. Once again, it is your responsibility to understand the working of the provided code.

IP stack and NFS

You need to update the 100ms tick you implemented in milestone 1 to call nfs_timeout, defined in libs/libnfs/src/nfs.c. This lets the NFS library retransmit any requests whose packets were dropped.
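
As a sketch, assuming nfs_timeout() takes no arguments (check its declaration in libs/libnfs) and a hypothetical register_timer() standing in for whatever your milestone 1 timer interface provides, the re-armed tick might look like this:

    #include <stdint.h>
    #include <nfs/nfs.h>            /* nfs_timeout(), from libs/libnfs */

    #define NFS_TICK_US 100000      /* 100 ms in microseconds */

    static void nfs_tick(uint32_t id, void *data)
    {
        nfs_timeout();                                /* let libnfs retransmit lost requests */
        register_timer(NFS_TICK_US, nfs_tick, NULL);  /* re-arm the 100 ms tick */
    }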

Apart from initialisation and mounting of filesystems, the NFS library provides an asynchronous interface: when you call a function, you must also provide a callback that will be invoked when the transaction completes. This should make it easy to provide a blocking interface to clients without blocking the whole system.

The NFS server you will be talking to is your host machine. The filesystem you mount is /var/tftpboot/$USER, where $USER should be replaced by your CSE username.

Design issues

One of the major issues you will need to deal with in this milestone is converting an asynchronous interface (as provided by libnfs) into a synchronous interface (as required by the system call interface).
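
One common pattern is to save the blocked client's reply capability, start the NFS operation with a continuation as the callback token, and only reply once the callback fires. The following is a minimal sketch under that assumption; cont_t, save_reply_cap(), my_nfs_read() and the callback signature are illustrative placeholders, not the actual libnfs or libsel4 API.

    #include <sel4/sel4.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    typedef struct {
        seL4_CPtr reply_cap;    /* saved reply cap of the blocked client */
        char     *user_buf;     /* destination buffer (assumed mapped in SOS) */
    } cont_t;

    /* Hypothetical helpers: save the caller's reply cap into a free slot,
     * and start an asynchronous NFS read (stand-in for the real libnfs call). */
    extern seL4_CPtr save_reply_cap(void);
    extern void my_nfs_read(int nbytes,
                            void (*cb)(uintptr_t, int, void *, int),
                            uintptr_t token);

    /* Completion callback: runs from the SOS event loop when the NFS reply arrives. */
    static void read_done(uintptr_t token, int status, void *data, int count)
    {
        cont_t *cont = (cont_t *)token;
        memcpy(cont->user_buf, data, count);           /* copy into the client's buffer */

        seL4_MessageInfo_t reply = seL4_MessageInfo_new(0, 0, 0, 1);
        seL4_SetMR(0, count);                          /* syscall return value: bytes read */
        seL4_Send(cont->reply_cap, reply);             /* wake the blocked client */
        /* free the saved reply cap slot here, then the continuation */
        free(cont);
    }

    /* Syscall handler: does NOT reply; the client stays blocked until read_done(). */
    void handle_sos_read(char *user_buf, int nbytes)
    {
        cont_t *cont = malloc(sizeof(*cont));
        cont->reply_cap = save_reply_cap();            /* e.g. via seL4_CNode_SaveCaller */
        cont->user_buf  = user_buf;
        my_nfs_read(nbytes, read_done, (uintptr_t)cont);
        /* fall back to the event loop without replying */
    }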

If you are planning on attempting the file system caching or dynamic file system advanced components, you should consider them in your design now.

For file descriptor allocation, we recommend implementing the lowest-available policy, i.e., returning the lowest numbered currently unused file descriptor. This will simplify later milestones.
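
A lowest-available allocator over the hypothetical fd_table sketched earlier is just a linear scan for the first free slot:

    /* Sketch: return the lowest unused descriptor, or -1 if the table is full. */
    static int fd_alloc(struct pcb *proc)
    {
        for (int fd = 0; fd < MAX_OPEN_FILES; fd++) {
            if (!proc->fd_table[fd].in_use) {
                proc->fd_table[fd].in_use = true;
                proc->fd_table[fd].offset = 0;
                return fd;
            }
        }
        return -1;    /* open() should fail with an appropriate error */
    }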

Benchmarking

Once you have your file system up and running, you must use your timer to benchmark your filesystem I/O performance. Don't leave this until the last minute, as it tends to be more time consuming than you expect (for example, it often uncovers bugs).

The most obvious (and required) metric is achieved bandwidth for reads and writes. This is simply the maximum number of bytes per second you can transfer between the remote host and your user address space.
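
A user-level read-bandwidth test can be as simple as timing a series of fixed-size reads, as in the sketch below. The sos_sys_* names, FM_READ, and the timestamp call are assumptions standing in for whatever your sos.h stub library actually declares.

    #include <stdio.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <sos.h>                         /* assumed to declare the sos_sys_* stubs and FM_READ */

    #define BUF_SIZE   (4 * 1024)            /* one I/O request size to sweep over */
    #define TOTAL_SIZE (1024 * 1024)         /* total bytes transferred per run */

    static char buf[BUF_SIZE];

    void bench_read(const char *path)
    {
        int fd = sos_sys_open(path, FM_READ);
        int64_t start = sos_sys_time_stamp();        /* microseconds from your timer */

        size_t done = 0;
        while (done < TOTAL_SIZE) {
            int n = sos_sys_read(fd, buf, BUF_SIZE);
            if (n <= 0) break;
            done += n;
        }

        int64_t elapsed = sos_sys_time_stamp() - start;
        printf("read %u bytes in %lld us: %.2f MB/s\n",
               (unsigned)done, (long long)elapsed,
               (double)done / (double)elapsed);       /* bytes per microsecond == MB/s */
        sos_sys_close(fd);
    }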

Given that you are likely to max out the slow Ethernet adapter in the kit, CPU utilisation is also a useful metric to measure. Given two systems that both max out the Ethernet adapter, the one with lower CPU utilisation is more efficient. If you decide to look at CPU utilisation, ask your demonstrator or the forum how best to tackle measuring it.

When benchmarking you may find it useful to measure other costs, such as latency and copying overhead. You may use a spreadsheet program, the gnuplot utility, or any other chart-generation program.

Note: we expect any reported results to be the average of multiple runs (e.g. >= 5, ideally > 9). Standard deviations are expected for any reported results.
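
For the averages and standard deviations, small helpers along these lines (sample standard deviation over n repeated measurements) are enough:

    #include <math.h>

    static double mean(const double *x, int n)
    {
        double sum = 0.0;
        for (int i = 0; i < n; i++)
            sum += x[i];
        return sum / n;
    }

    static double stddev(const double *x, int n)
    {
        double m = mean(x, n), ss = 0.0;
        for (int i = 0; i < n; i++)
            ss += (x[i] - m) * (x[i] - m);
        return sqrt(ss / (n - 1));        /* sample standard deviation, n > 1 */
    }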

Some things you may like to think about while you're benchmarking are:

  • What are you trying to measure?
  • What are you actually measuring?
  • How much do the measurements themselves affect the timing? (e.g. copying/printing)
  • What is the variance/deviation?
  • What are the cache effects (memory, network, disk)?
  • How does read vs. write vs. read/write behave?
  • How does the NFS host machine affect performance? Is it idle during your measurements?

Feel free to discuss benchmarking methodologies etc. and post interim results on the forum. Which group has the most efficient design?


Assessment

You must demonstrate that you have implemented the open, close, read, write, getdirent and stat system calls. The supplied ls and cp commands in sosh can be used, along with your own test code.

You will need to present at least two graphs to your tutor: one showing file I/O bandwidth as the I/O request size varies, and one showing bandwidth as the underlying NFS packet size varies (for appropriately large I/O requests). You must be able to explain why the graphs look the way they do. Your numbers must be taken using your timer driver. You can also show more (pertinent) graphs if you like. The results need to be an average of multiple samples, with standard deviations shown where significant.

NOTE: The achieved bandwidth on the graph will not be assessed directly. However, clearly better performance indicates a better filesystem implementation. Your tutor is interested in your ability to explain what you measured and how you measured it.

As always, you should be able to explain any design decisions you made; specifically, the design of your open file table and how you dealt with the asynchronous/synchronous problem.


Last modified: 29 Aug 2014.