Advanced Operating Systems
M5: A timer driver and benchmarking
In this milestone you will implement a timer driver, and the two timer-related system calls. The timer-related system calls must be implemented using your driver. Using your timer driver you will then benchmark file I/O performance.
For many purposes, such as benchmarking, a high clock resolution is desirable. The NSLU2/IXP420 platform features a timer cell containing several high-frequency counters that can be used as timers. By measuring the performance of your file system you will gain an understanding of how to measure the performance of system code. Benchmarking is also a great way to test stability.
Your driver needs to export the interface specified in clock.h. There are the following functions:
The interface is just an internal function call interface. You do not
need to export this interface to the users. User programs will indirectly
access the clock driver through the
The NSLU2 "Slug"
This is a block diagram of the IXP420 network controller on which the Slug is based. The IXP420 is a single-chip computer with an XScale processor and lots of dedicated hardware for I/O. The platform hardware reference manual can be found in .
Although the Slug is a very powerful platform with lots of functionality, we will only be writing a driver for the 'timers' cell on the "Advanced Peripheral Bus" (APB). If you are interested in the full arcana of device drivers for the other cells, particularly the networking stacks, then explore the directories libs/ixp_isal and libs/ixp400_xscale_sw, but be warned: that code is really, really hard to read and understand.
Timer cell (OSTS)
Your main job is to learn how to program the IXP420's timer cell (OSTS) to generate timer interrupts and how to write an L4 driver to handle these interrupts.
The OSTS contains four functions: two general-purpose 32-bit timers, a 32-bit timestamp counter and an emergency watchdog timer. For our purposes, we are only interested in the OSTS as our source of timer interrupts. You will be using one of the general-purpose timers and the timestamp register. The timer interrupts are connected to interrupt lines 5, 11 and 14; see nslu2.h for meaningful names.
There is an abundance of different devices and writing device drivers is usually seen as a very difficult task. This is true in a sense. However, programming a device is really just a matter of learning about its registers, what values to read and write to those registers and when to do it.
The minimal subset of the OSTS's functionality that you must understand and use is listed below (the numbers in parentheses are the offsets of the register addresses, in bytes, from the OSTS's base address). Refer to Chapter 14 of the developer's manual for more complete descriptions.
Again, this is only the minimal understanding that you need. For deeper understanding, you are encouraged to learn about the other registers and can even play with those as well.
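A 32-bit counter such as the timestamp register wraps around, so a driver that reports long intervals has to extend it in software. The following is a minimal sketch of one way to do that, not the required design. The names `struct tick_state`, `ticks_extend` and `ticks_to_us` are hypothetical and do not come from clock.h, and the counter frequency is passed in as a parameter because the real value must be taken from the manual.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical driver state; not part of clock.h. */
struct tick_state {
    uint32_t last_low;  /* last raw 32-bit counter value seen */
    uint32_t high;      /* number of wrap-arounds observed so far */
};

/* Fold a fresh 32-bit counter reading into a 64-bit tick count.
 * This only works if it is called at least once per wrap period;
 * otherwise a wrap-around goes unnoticed. */
static uint64_t ticks_extend(struct tick_state *s, uint32_t raw)
{
    if (raw < s->last_low)
        s->high++;              /* counter wrapped since the last reading */
    s->last_low = raw;
    return ((uint64_t)s->high << 32) | raw;
}

/* Convert ticks to microseconds for a counter running at hz ticks/second.
 * The division is split so that the intermediate product stays small. */
static uint64_t ticks_to_us(uint64_t ticks, uint32_t hz)
{
    assert(hz != 0);
    return (ticks / hz) * 1000000u + ((ticks % hz) * 1000000u) / hz;
}
```

One possible arrangement is to call `ticks_extend` both from the timer interrupt handler and from every timestamp query, so that no wrap is missed between readings.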
NOTE: This section is deliberately kept short (e.g., we do not dictate which timer to use or in what mode to use it). The idea is that you work things out for yourself and make your own design and implementation decisions. There are only two conditions that must be satisfied:
The Pistachio/ARM kernel exports specific interrupts to a user level
interrupt handler via IPC. User level threads are associated with
interrupts by the privileged task using the
Refer to the L4 reference manual for further information on the interrupt registration and delivery protocol, in sections 2.4 ThreadControl and 7.2 Interrupt Protocol respectively.
In Pistachio/ARM, device registers are memory mapped. That is, hardware registers can be accessed via normal load/store operations to special addresses. To access device registers, you must first map the memory with the appropriate attributes; see Chapter 4 in the L4 reference manual. Note that all accesses to device registers must bypass the cache. When requesting a device mapping you must specify that you want an uncached mapping.
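Once the device memory is mapped uncached, register accesses are ordinary loads and stores through a `volatile` pointer, which stops the compiler from eliding or merging them. Below is a small sketch, assuming `base` is the virtual address at which you mapped the device. The helper names `reg_read32` and `reg_write32` are illustrative only, and obtaining the mapping itself is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Read the 32-bit register at byte offset `off` from a mapped device base. */
static inline uint32_t reg_read32(volatile void *base, uint32_t off)
{
    assert((off & 3u) == 0);    /* 32-bit registers are word aligned */
    return *(volatile uint32_t *)((volatile uint8_t *)base + off);
}

/* Write `val` to the 32-bit register at byte offset `off`. */
static inline void reg_write32(volatile void *base, uint32_t off, uint32_t val)
{
    assert((off & 3u) == 0);
    *(volatile uint32_t *)((volatile uint8_t *)base + off) = val;
}
```

Keeping all register traffic behind two helpers like these also gives you a single place to add tracing while debugging the driver.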
You may need to resolve some or all of these issues:
The file clock.h, as well as some other header files you might find useful, can be found here.
Once you have a working clock driver, and have verified it is timing accurately, you must use it to benchmark your filesystem I/O performance.
The most obvious (and required) metric is achieved bandwidth for reads and writes. This is simply the maximum number of bytes per second you can transfer between the remote host and your user address space.
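As a sketch of the arithmetic involved, bandwidth is the number of bytes moved divided by the elapsed time, with both timestamps taken from your timer driver. The `struct io_sample` type and the function name are hypothetical, and the timestamps are assumed to be in microseconds.

```c
#include <assert.h>
#include <stdint.h>

/* One timed transfer; timestamps are assumed to be in microseconds. */
struct io_sample {
    uint64_t start_us;  /* timestamp taken just before the transfer */
    uint64_t end_us;    /* timestamp taken just after the transfer */
    uint64_t bytes;     /* bytes actually transferred */
};

/* Achieved bandwidth in bytes per second.
 * Returns 0 when the interval is too short to measure. */
static uint64_t sample_bandwidth(const struct io_sample *s)
{
    uint64_t elapsed_us;

    assert(s->end_us >= s->start_us);   /* timestamps must not go backwards */
    elapsed_us = s->end_us - s->start_us;
    if (elapsed_us == 0)
        return 0;
    return (s->bytes * 1000000u) / elapsed_us;
}
```

A result of 0 usually means the transfer was too small for the clock resolution, in which case timing a larger transfer or a batch of transfers gives a more meaningful figure.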
When benchmarking you may find it useful to measure other costs such as latency, copying costs, etc. You may find the gnuplot utility useful.
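Single measurements of latency tend to be noisy, so it can help to repeat each measurement and report a summary. The following sketch computes the minimum, maximum and mean of a set of latency samples. The type and function names are hypothetical, and the samples are assumed to be in microseconds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Summary of repeated latency measurements, in microseconds. */
struct lat_stats {
    uint64_t min_us;
    uint64_t max_us;
    uint64_t mean_us;   /* integer mean, rounded down */
};

/* Summarise n latency samples; n must be at least 1. */
static struct lat_stats lat_summarise(const uint64_t *samples_us, size_t n)
{
    struct lat_stats st;
    uint64_t sum = 0;
    size_t i;

    assert(samples_us != NULL && n > 0);
    st.min_us = samples_us[0];
    st.max_us = samples_us[0];
    for (i = 0; i < n; i++) {
        if (samples_us[i] < st.min_us)
            st.min_us = samples_us[i];
        if (samples_us[i] > st.max_us)
            st.max_us = samples_us[i];
        sum += samples_us[i];
    }
    st.mean_us = sum / n;
    return st;
}
```

Plotting the minimum and maximum alongside the mean makes outliers visible, for example a first request that is slower than the rest.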
Some things you may like to think about while you're benchmarking are:
Feel free to discuss benchmarking methodologies etc. and post interim results on the Wiki.
You should be able to show some test code that uses all the functions specified in the driver interface.
You should also show some user level test code that uses the
You will need to present at least two graphs to your tutor. The two required graphs show bandwidth for file I/O while varying I/O request size and underlying NFS packet size (for appropriately large I/O requests). You must be able to explain why the graphs look the way they do. Your numbers must be taken using your timer driver. You can also show more (pertinent) graphs if you like.
NOTE: The achieved bandwidth on the graph will not be compared group-to-group. However, clearly better performance shows a better filesystem implementation. Your tutor is interested in your ability to explain what you measured and how you measured it.
Last modified: 05 Sep 2006.