Advanced Operating Systems
COMP9242 2007/S2

M5: A timer driver and benchmarking

In this milestone you will implement a timer driver, and the two timer-related system calls. The timer-related system calls must be implemented using your driver. Using your timer driver you will then benchmark file I/O performance.


  • Learn and understand the fundamentals of writing a device driver.
  • Play with real hardware.
  • Learn about interrupt handling in L4.
  • Time system behaviour on L4.


For many purposes, such as benchmarking, a high clock resolution is desirable. The NSLU2/IXP420 platform features a timer cell containing several high-frequency counters which can be used as timers. By measuring the performance of your file system you will gain an understanding of how to measure the performance of system code. Benchmarking is also a great way to test stability.

The Driver Interface

Your driver needs to export the interface specified in clock.h. There are the following functions:

Initialises the driver.
Registers the calling thread to receive an IPC message after a specified time interval (in microseconds, though actual wakeup resolution will depend on the timer resolution). Several registrations may be pending at any time.

The IPC message sent by the timer contains in MR0 a return code (TIMER_R_OK, TIMER_R_UINT, TIMER_R_CNCL) and in MR1 and MR2 the present value of the real-time clock.

Returns the current real-time clock value (microsecond accurate).
Stops operation of the driver. This will cancel any outstanding timer requests (by sending them a premature IPC indicating failure).

The interface is just an internal function call interface. You do not need to export this interface to users. User programs will access the clock driver indirectly through the time and sleep syscalls.
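The reply message described above carries the real-time clock value in two message registers, since a 64-bit microsecond count does not fit in one 32-bit word. A minimal sketch of splitting and rejoining such a value follows. The struct and function names are hypothetical, and the choice of MR1 for the low half and MR2 for the high half is an assumption; check clock.h and your own IPC code for the actual convention.

```c
#include <stdint.h>

/* Hypothetical container for the two clock words of the reply message.
 * Which register carries which half is an assumption made here. */
struct clock_words {
    uint32_t mr1; /* assumed: low 32 bits of the clock value */
    uint32_t mr2; /* assumed: high 32 bits of the clock value */
};

/* Split a 64-bit microsecond value into two 32-bit words. */
static struct clock_words clock_split(uint64_t us)
{
    struct clock_words w;
    w.mr1 = (uint32_t)(us & 0xffffffffu);
    w.mr2 = (uint32_t)(us >> 32);
    return w;
}

/* Rebuild the 64-bit value on the receiving side. */
static uint64_t clock_join(struct clock_words w)
{
    return ((uint64_t)w.mr2 << 32) | w.mr1;
}
```

Whatever order you choose, the sender and the receiver must agree on it.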

The NSLU2 "Slug"

[Figure: Slug architecture, a block diagram of the IXP420 network controller on which the Slug is based.]

The IXP420 is a single-chip computer which has an XScale processor and lots of dedicated hardware for I/O. The platform hardware reference manual can be found in [1].

Although the Slug is a very powerful platform with lots of functionality, we will only be writing a driver for the 'timers' cell on the "Advanced Peripheral Bus" (APB). If you are interested in the full arcana of device drivers for the other cells, particularly the networking stacks, then explore the directories libs/ixp_isal and libs/ixp400_xscale_sw, but be warned: that code is really, really hard to read and understand.

Timer cell (OSTS)

Your main job is to learn how to program the IXP420's timer cell (OSTS) to generate timer interrupts and how to write an L4 driver to handle these interrupts.

The OSTS provides four functions: two general-purpose 32-bit timers, a 32-bit timestamp counter and an emergency watchdog timer. For our purposes, we are only interested in the OSTS as our source of timer interrupts. You will be using one of the general-purpose timers and the timestamp register. The interrupts for the timers are connected to interrupt lines 5, 11 and 14; see nslu2.h for meaningful names.
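Because the timestamp counter is only 32 bits wide, it wraps fairly quickly at a high clock rate (on the order of a minute at tens of megahertz), so a driver usually keeps the upper bits in software. The following is a minimal sketch of that idea, not a prescribed design. The names are hypothetical, and the 66.666 MHz frequency is an assumption that you should verify against the hardware manual.

```c
#include <stdint.h>

/* Assumed counter frequency in Hz; verify against the hardware manual. */
#define TICKS_PER_SEC 66666666ULL

/* Software-maintained upper 32 bits of the timestamp. */
static uint32_t ts_high;

/* Call once for every timestamp overflow interrupt. */
static void ts_overflow(void)
{
    ts_high++;
}

/* Combine the software high word with the hardware low word. */
static uint64_t ts_ticks(uint32_t ts_low)
{
    return ((uint64_t)ts_high << 32) | ts_low;
}

/* Convert ticks to microseconds without overflowing 64 bits. */
static uint64_t ticks_to_us(uint64_t ticks)
{
    uint64_t secs = ticks / TICKS_PER_SEC;
    uint64_t rem  = ticks % TICKS_PER_SEC;
    return secs * 1000000ULL + (rem * 1000000ULL) / TICKS_PER_SEC;
}
```

A real driver also has to deal with the case where the counter wraps between reading the low word and processing the overflow interrupt; this sketch ignores that race.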

There is an abundance of different devices and writing device drivers is usually seen as a very difficult task. This is true in a sense. However, programming a device is really just a matter of learning about its registers, what values to read and write to those registers and when to do it.

The minimal subset of the OSTS's functionality that you must understand and use is listed below (the numbers in parentheses are the offsets of the register addresses, in bytes, from the OSTS's base address). Refer to Chapter 14 of the developer's manual for more complete descriptions.

  • Time-Stamp Timer (OST_TS 0x00 R/W) : Use this register as the lower 32-bits of your timestamp value.
  • General-Purpose Timer[0-1] (OST_TIMx 0x04&0x0c RO): The actual timer/counters. You need to use at least one of them (take your pick).
  • General-Purpose Timer Reload[0-1] (OST_TIMx_RL 0x08&0x10 R/W): These registers control the behaviour of the general purpose timers.
  • Timer Status (OST_STATUS 0x20 R/W): When an interrupt occurs, this register tells you which of the many sources of interrupts in the OSTS actually caused it, and lets you clear that interrupt when you have finished processing it.
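The register list above can be written down as byte offsets plus a pair of access helpers. This is only a sketch: the offsets are the ones listed above, while the helper names are hypothetical and `base` stands for wherever you end up mapping the OSTS registers.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offsets from the OSTS base address, as listed above. */
enum {
    OST_TS      = 0x00, /* time-stamp timer */
    OST_TIM0    = 0x04, /* general-purpose timer 0 */
    OST_TIM0_RL = 0x08, /* timer 0 reload */
    OST_TIM1    = 0x0c, /* general-purpose timer 1 */
    OST_TIM1_RL = 0x10, /* timer 1 reload */
    OST_STATUS  = 0x20  /* timer status */
};

/* Read a 32-bit register; volatile stops the compiler from
 * caching or reordering the device access. */
static uint32_t ost_read(volatile void *base, size_t off)
{
    return *(volatile uint32_t *)((volatile uint8_t *)base + off);
}

/* Write a 32-bit register. */
static void ost_write(volatile void *base, size_t off, uint32_t val)
{
    *(volatile uint32_t *)((volatile uint8_t *)base + off) = val;
}
```

Keeping all device access behind two small helpers also makes it easy to test the rest of the driver against an ordinary array standing in for the hardware.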

Again, this is only the minimal understanding that you need. For deeper understanding, you are encouraged to learn about the other registers and can even play with those as well.
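As an illustration of how a reload register value might be put together, the sketch below converts an interval in microseconds into a reload word. Everything specific here is an assumption to be checked against Chapter 14 of the developer's manual: the 66.666 MHz input frequency, the use of bit 0 as the enable flag and bit 1 as the one-shot flag, and the count occupying the remaining upper bits. The function name is hypothetical.

```c
#include <stdint.h>

#define OST_HZ        66666666ULL /* assumed timer input frequency */
#define OST_ENABLE    0x1u        /* assumed: bit 0 enables counting */
#define OST_ONE_SHOT  0x2u        /* assumed: bit 1 selects one-shot mode */
#define OST_CTRL_MASK 0x3u        /* the two control bits together */

/* Build a reload word for the given interval in microseconds. */
static uint32_t ost_reload_word(uint32_t interval_us, int one_shot)
{
    uint64_t ticks = ((uint64_t)interval_us * OST_HZ) / 1000000ULL;
    uint32_t word;

    /* Clamp intervals that do not fit in the 32-bit register. */
    if (ticks > 0xffffffffULL)
        ticks = 0xffffffffULL;

    /* The low two bits hold control flags, not part of the count. */
    word = (uint32_t)ticks & ~OST_CTRL_MASK;
    word |= OST_ENABLE;
    if (one_shot)
        word |= OST_ONE_SHOT;
    return word;
}
```

Intervals longer than the counter can express are clamped here; a driver would instead need to chain several timer periods or track long deadlines in software.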

NOTE: This section is deliberately kept short (e.g., we do not dictate which timer to use or in what mode to use it). The idea is that you work things out for yourself and make your own design and implementation decisions. There are only two conditions that must be satisfied:

  1. You must use an interrupt generated from the OSTS.
  2. You must implement the driver interface described above.

Pistachio/ARM Interrupts

The Pistachio/ARM kernel exports specific interrupts to a user level interrupt handler via IPC. User level threads are associated with interrupts by the privileged task using the ThreadControl call. After that, any interrupts of the registered number will get sent as an IPC to the interrupt handler thread.

Refer to the L4 reference manual for further information on the interrupt registration and delivery protocol, in sections 2.4 ThreadControl and 7.2 Interrupt Protocol respectively.

Device Mappings

In Pistachio/ARM, device registers are memory mapped. That is, hardware registers can be accessed via normal load/store operations to special addresses. To access device registers, you must first map the memory with the appropriate attributes; see Chapter 4 in the L4 reference manual. Note that all accesses to device registers must bypass the cache. When requesting a device mapping you must specify that you want uncached mappings.


You may need to resolve some or all of these issues:

  • At what address do the OSTS registers need to be mapped and accessed through?
  • What value must the timer be programmed with to get a period of x milliseconds?
  • How are the interrupts acknowledged?
  • Single or multi-threaded driver?
  • Which data structures should I use?
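On the data structure question, one simple option (among many) is a small array of pending registrations kept sorted by wake-up time, so that the earliest deadline is always at the front. The sketch below is one possible starting point under stated assumptions: the capacity, the type names and the use of a plain integer in place of an L4 thread identifier are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 32 /* hypothetical capacity */

struct pending {
    uint64_t deadline_us; /* absolute wake-up time in microseconds */
    uint32_t client;      /* stand-in for a thread identifier */
};

struct pending_queue {
    struct pending slot[MAX_PENDING];
    size_t count;
};

/* Insert while keeping slots sorted by ascending deadline.
 * Returns 0 on success, -1 if the queue is full. */
static int pq_insert(struct pending_queue *q, uint64_t deadline_us,
                     uint32_t client)
{
    size_t i;

    if (q->count == MAX_PENDING)
        return -1;
    i = q->count;
    while (i > 0 && q->slot[i - 1].deadline_us > deadline_us) {
        q->slot[i] = q->slot[i - 1];
        i--;
    }
    q->slot[i].deadline_us = deadline_us;
    q->slot[i].client = client;
    q->count++;
    return 0;
}

/* Remove the earliest entry if it is due at now_us.
 * Returns 1 and fills *out if an entry was removed, else 0. */
static int pq_pop_due(struct pending_queue *q, uint64_t now_us,
                      struct pending *out)
{
    size_t i;

    if (q->count == 0 || q->slot[0].deadline_us > now_us)
        return 0;
    *out = q->slot[0];
    for (i = 1; i < q->count; i++)
        q->slot[i - 1] = q->slot[i];
    q->count--;
    return 1;
}
```

On each timer interrupt the driver would call `pq_pop_due` in a loop until it returns 0, replying to each client it removes. Insertion and removal are linear in the number of pending requests, which is fine for small queues; a heap or delta list trades simplicity for better scaling.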


The file clock.h, as well as another header file you might find useful, can be found in the libs/clock directory.


Once you have a working clock driver, and have verified it is timing accurately, you must use it to benchmark your filesystem I/O performance.

The most obvious (and required) metric is achieved bandwidth for reads and writes. This is simply the maximum number of bytes per second you can transfer from the remote host to your user address space.
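Given two timestamps from your clock driver, the bandwidth calculation itself is short. A minimal sketch, with a hypothetical function name and timestamps assumed to be in microseconds:

```c
#include <stdint.h>

/* Bytes per second for a transfer of `bytes` bytes that started at
 * start_us and finished at end_us (both in microseconds).
 * Returns 0 if no time elapsed, to avoid dividing by zero. */
static uint64_t bandwidth_bps(uint64_t bytes, uint64_t start_us,
                              uint64_t end_us)
{
    if (end_us <= start_us)
        return 0;
    return (bytes * 1000000ULL) / (end_us - start_us);
}
```

For example, 1 MiB transferred in half a second gives 2097152 bytes per second. Take the timestamps as close to the transfer as possible, and keep printing out of the timed region.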

When benchmarking you may find it useful to measure other costs such as latency, copying costs, etc. You may find the gnuplot utility useful.

Some things you may like to think about while you're benchmarking are:

  • What are you trying to measure?
  • What are you actually measuring?
  • How much do the measurements themselves affect the timing (e.g. copying or printing)?
  • What is the variance?
  • What are the cache effects (memory, network, disk)?
  • How does read vs. write vs. read/write behave?
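To answer the variance question you need repeated runs of the same measurement. The sketch below computes the mean and sample variance of a set of timings; the names are hypothetical. Since it uses floating point, you may prefer to log the raw samples and run this kind of analysis on the host rather than on the Slug.

```c
#include <stddef.h>

struct stats {
    double mean;
    double variance; /* sample variance, divides by n - 1 */
};

/* Mean and sample variance of n measurements.
 * Variance is reported as 0 when fewer than two samples exist. */
static struct stats sample_stats(const double *x, size_t n)
{
    struct stats s = { 0.0, 0.0 };
    double sum = 0.0, sq = 0.0;
    size_t i;

    if (n == 0)
        return s;
    for (i = 0; i < n; i++)
        sum += x[i];
    s.mean = sum / (double)n;
    if (n < 2)
        return s;
    for (i = 0; i < n; i++) {
        double d = x[i] - s.mean;
        sq += d * d;
    }
    s.variance = sq / (double)(n - 1);
    return s;
}
```

A large variance relative to the mean suggests that something other than the code under test (caching, network traffic, the measurement itself) is influencing the result.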

Feel free to discuss benchmarking methodologies etc. and post interim results on the Wiki.


You should be able to show some test code that uses all the functions specified in the driver interface.

You should also show some user-level test code that uses the time_stamp and sleep system calls. You may find it useful to extend sosh to have a time and a sleep command. The sleep implementation must use your clock driver.

You will need to present at least two graphs to your tutor. The two required graphs show bandwidth for file I/O while varying I/O request size and underlying NFS packet size (for appropriately large I/O requests). You must be able to explain why the graphs look the way they do. Your numbers must be taken using your timer driver. You can also show more (pertinent) graphs if you like.

NOTE: The achieved bandwidth on the graph will not be compared group-to-group. However, clearly better performance shows a better filesystem implementation. Your tutor is interested in your ability to explain what you measured and how you measured it.

Last modified: 10 Sep 2007.