Advanced Operating Systems
COMP9242 2010/S2
UNSW

M5: A timer driver and benchmarking

In this milestone you will implement a timer driver, and the two timer-related system calls. The timer-related system calls must be implemented using your driver. Using your timer driver you will then benchmark file I/O performance.

Goals

  • Learn and understand the fundamentals of writing a device driver.
  • Play with real hardware.
  • Learn about interrupt handling in L4.
  • Time system behaviour on L4.

Motivation

For many purposes, such as benchmarking, a high clock resolution is desired. The NSLU2/IXP420 platform features a timer cell containing several high-frequency counters which can be used as timers. By measuring the performance of your file system you will gain an understanding of how to measure a system's code. Benchmarking is also a great way to test stability.

The Driver Interface

Your driver needs to export the interface specified in clock.h, which consists of the following functions:

start_timer
Initialises the driver.
register_timer
Registers the calling thread to receive an IPC message after a specified time interval (in microseconds, though actual wakeup resolution will depend on the timer resolution). Several registrations may be pending at any time.

The IPC message sent by the timer contains in MR0 a return code (TIMER_R_OK, TIMER_R_UINT, TIMER_R_CNCL) and in MR1 and MR2 the present value of the real-time clock.

time_stamp
Returns the current real-time clock value (microsecond accurate).
stop_timer
Stops operation of the driver. This will cancel any outstanding timer requests (by sending them a premature IPC indicating failure).

The interface is just an internal function call interface. You do not need to export this interface to the users. User programs will indirectly access the clock driver through the time and sleep syscalls.
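Because several registrations may be pending at once, the driver needs some structure that tells it which request expires next. The following is only a minimal sketch of one possible choice, a small array kept sorted by deadline. The names tq_init, tq_insert, tq_pop_expired, MAX_PENDING and the client field are all hypothetical and are not part of clock.h; a real driver would store whatever it needs to send the reply IPC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 32          /* hypothetical capacity, pick your own */

/* One pending registration: who to wake and when. */
struct pending_timer {
    uint64_t deadline_us;       /* absolute wake-up time in microseconds */
    uint32_t client;            /* hypothetical identifier of the waiting thread */
};

/* Array kept sorted by ascending deadline, so slot[0] expires first. */
struct timer_queue {
    struct pending_timer slot[MAX_PENDING];
    size_t count;
};

static void tq_init(struct timer_queue *q)
{
    q->count = 0;
}

/* Insert a registration, keeping the array sorted.
 * Returns 0 on success, -1 if the queue is full. */
static int tq_insert(struct timer_queue *q, uint64_t deadline_us, uint32_t client)
{
    if (q->count == MAX_PENDING)
        return -1;
    size_t i = q->count;
    /* Shift entries with later deadlines up by one slot to make room.
     * Entries with an equal deadline stay in arrival order. */
    while (i > 0 && q->slot[i - 1].deadline_us > deadline_us) {
        q->slot[i] = q->slot[i - 1];
        i--;
    }
    q->slot[i].deadline_us = deadline_us;
    q->slot[i].client = client;
    q->count++;
    return 0;
}

/* Remove the earliest entry if its deadline has passed at time now_us.
 * Returns 1 and fills *out if an entry was removed, 0 otherwise. */
static int tq_pop_expired(struct timer_queue *q, uint64_t now_us,
                          struct pending_timer *out)
{
    assert(out != NULL);
    if (q->count == 0 || q->slot[0].deadline_us > now_us)
        return 0;
    *out = q->slot[0];
    for (size_t i = 1; i < q->count; i++)
        q->slot[i - 1] = q->slot[i];
    q->count--;
    return 1;
}
```

In such a design the interrupt handler would call tq_pop_expired in a loop with the current timestamp, send one reply per removed entry, and then program the hardware timer for the deadline now in slot[0]. Insertion and removal are linear in the number of pending requests; a heap or delta list may be worth considering if you expect many sleepers.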

The NSLU2 "Slug"

[Figure: Slug architecture]

This is a block diagram of the IXP420 network controller on which the Slug is based. The IXP420 is a single-chip computer which has an XScale processor and lots of dedicated hardware for I/O. The platform hardware reference manual can be found in [1].

Although the Slug is a very powerful platform with lots of functionality, we will only be writing a driver for the 'timers' cell on the "Advanced Peripheral Bus" (APB). If you are interested in the full arcana of device drivers for the other cells, particularly the networking stacks, then explore the directories libs/ixp_isal and libs/ixp400_xscale_sw. Be warned, though: that code is really, really hard to read and understand.

Timer cell (OSTS)

Your main job is to learn how to program the IXP420's timer cell (OSTS) to generate timer interrupts and how to write an L4 driver to handle these interrupts.

The OSTS contains four functions: two general-purpose 32-bit timers, a 32-bit time counter and an emergency watchdog timer. For our purposes, we are only interested in the OSTS as our source of timer interrupts. You will be using one of the general-purpose timers and the timestamp register. The interrupt vectors for the timers are connected to interrupt lines 5, 11 and 14; see nslu2.h for meaningful names.

There is an abundance of different devices and writing device drivers is usually seen as a very difficult task. This is true in a sense. However, programming a device is really just a matter of learning about its registers, what values to read and write to those registers and when to do it.

The minimal subset of the OSTS's functionality that you must understand and use is listed below (the numbers in parentheses are the offsets of the register addresses, in bytes, from the OSTS's base address). For more complete descriptions, refer to Chapter 14 of the developer's manual.

  • Time-Stamp Timer (OST_TS 0x00 R/W) : Use this register as the lower 32-bits of your timestamp value.
  • General-Purpose Timer[0-1] (OST_TIMx 0x04&0x0c RO): The actual timer/counters. You need to use at least one of them (take your pick).
  • General-Purpose Timer Reload[0-1] (OST_TIMx_RL 0x08&0x10 R/W): These registers control the behaviour of the general purpose timers.
  • Timer Status (OST_STATUS 0x20 R/W) : When an interrupt occurs, this register tells you which of the many sources of interrupts in the OSTS actually caused it. It also lets you clear that interrupt when you have finished processing it.
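As an illustration of what "reading and writing registers" looks like in C, here is a hedged sketch that turns the offsets listed above into constants and wraps the accesses in two small helpers. The helper names ost_read, ost_write and ost_timestamp_ticks are hypothetical, and base stands for whatever virtual address you end up mapping the OSTS registers at. The sketch deliberately says nothing about which values to write; that is for you to work out from the manual.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offsets from the OSTS base address, taken from the list above. */
#define OST_TS       0x00u  /* time-stamp timer */
#define OST_TIM0     0x04u  /* general-purpose timer 0 */
#define OST_TIM0_RL  0x08u  /* general-purpose timer 0 reload */
#define OST_TIM1     0x0cu  /* general-purpose timer 1 */
#define OST_TIM1_RL  0x10u  /* general-purpose timer 1 reload */
#define OST_STATUS   0x20u  /* timer status */

/* Read a 32-bit device register. The volatile qualifier forces the
 * compiler to perform every access instead of eliding or merging it. */
static inline uint32_t ost_read(volatile void *base, uint32_t offset)
{
    assert((offset & 3u) == 0);     /* registers are word aligned */
    return *(volatile uint32_t *)((volatile uint8_t *)base + offset);
}

/* Write a 32-bit device register. */
static inline void ost_write(volatile void *base, uint32_t offset,
                             uint32_t value)
{
    assert((offset & 3u) == 0);
    *(volatile uint32_t *)((volatile uint8_t *)base + offset) = value;
}

/* Build a 64-bit tick count from a software-maintained upper word and
 * OST_TS as the lower 32 bits. Keeping high_word up to date when the
 * counter wraps is the caller's job (an assumption of this sketch). */
static inline uint64_t ost_timestamp_ticks(volatile void *base,
                                           uint32_t high_word)
{
    return ((uint64_t)high_word << 32) | ost_read(base, OST_TS);
}
```

Because the helpers only take a base pointer, they can be exercised against an ordinary array on a development host before being pointed at the real device mapping. Note that volatile only constrains the compiler; the mapping itself must still be uncached.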

Again, this is only the minimal understanding that you need. For deeper understanding, you are encouraged to learn about the other registers and can even play with those as well.

NOTE: This section is deliberately kept short (e.g., we do not dictate which timer to use or in what mode to use it). The idea is that you work things out for yourself and make your own design and implementation decisions. There are only two conditions that must be satisfied:

  1. You must use an interrupt generated from the OSTS.
  2. You must implement the driver interface described above.

OKL4/ARM Interrupts

The OKL4/ARM kernel exports specific interrupts to a user level interrupt handler via asynchronous notification. User level threads are given permission to associate themselves with interrupts by the privileged task using the SecurityControl call. After that, the thread may call InterruptControl to associate itself with an interrupt. Thereafter, any interrupts of the registered number will get sent as an async notification to the interrupt handler thread.

Refer to the OKL4 reference manual for further information on the interrupt registration and delivery protocol, in sections A-11.4 InterruptControl, A-11.12 SecurityControl, and A-7 Interrupts. A-6.4 Sending Asynchronous Notification Messages may also be useful.

Device Mappings

In OKL4/ARM, device registers are memory mapped. That is, hardware registers can be accessed via normal load/store operations to special addresses. To access device registers, you must first map the memory with the appropriate attributes; see Section A-11.6 in the OKL4 reference manual. Note that all accesses to device registers must bypass the cache, so when requesting a device mapping you must specify that you want it uncached.

Issues

You may need to resolve some or all of these issues:

  • At what address do the OSTS registers need to be mapped and accessed through?
  • What value must the timer be programmed with to get a period of x milliseconds?
  • How are the interrupts acknowledged?
  • Single or multi-threaded driver?
  • Which data structures should I use?
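The question of what value to program for a given interval comes down to converting between time and counter ticks. The arithmetic can be sketched as follows, assuming a counter that runs at a fixed freq_hz; the function names are hypothetical, and the actual counter frequency and any control bits in the reload register must be taken from the hardware manual.

```c
#include <assert.h>
#include <stdint.h>

#define US_PER_SEC 1000000u

/* Convert an interval in microseconds to counter ticks, rounding up so
 * that a sleeper is never woken early. Whole seconds and the remainder
 * are converted separately to avoid overflowing the 64-bit product. */
static uint64_t us_to_ticks(uint64_t us, uint64_t freq_hz)
{
    assert(freq_hz > 0);
    uint64_t whole = (us / US_PER_SEC) * freq_hz;
    uint64_t part  = ((us % US_PER_SEC) * freq_hz + (US_PER_SEC - 1))
                     / US_PER_SEC;
    return whole + part;
}

/* Convert a tick count back to microseconds, rounding down. */
static uint64_t ticks_to_us(uint64_t ticks, uint64_t freq_hz)
{
    assert(freq_hz > 0);
    uint64_t whole = (ticks / freq_hz) * US_PER_SEC;
    uint64_t part  = ((ticks % freq_hz) * US_PER_SEC) / freq_hz;
    return whole + part;
}
```

For example, with a hypothetical 1 MHz counter a 1000 microsecond interval is exactly 1000 ticks, while with a 500 kHz counter a 1 microsecond request rounds up to a single tick. If the result does not fit in the 32-bit timer, the driver has to split the wait into several shorter periods.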

Files

The file clock.h, as well as another header file you might find useful, can be found in the libs/clock directory.

Benchmarking

Once you have a working clock driver, and have verified it is timing accurately, you must use it to benchmark your filesystem I/O performance.

The most obvious (and required) metric is achieved bandwidth for reads and writes. This is simply the maximum number of bytes per second you can transfer between the remote host and your user address space.

When benchmarking you may find it useful to measure other costs such as latency, copying costs, etc. You may find the gnuplot utility useful.

Some things you may like to think about while you're benchmarking are:

  • What are you trying to measure?
  • What are you actually measuring?
  • How much do the measurements themselves affect the timing? (e.g. copying/printing)
  • What is the variance?
  • What are the cache effects (memory, network, disk)?
  • How does read vs. write vs. read/write behave?
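To make the bandwidth and variance questions concrete, the following sketch shows how a run's bandwidth could be derived from two timestamps and how repeated runs could be summarised. It is an illustration only: the function names are hypothetical, and the timestamps are assumed to be microsecond values such as those returned by your driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bandwidth of one transfer in bytes per second, given the byte count
 * and the microsecond timestamps taken before and after it. */
static double bandwidth_bytes_per_sec(uint64_t bytes, uint64_t start_us,
                                      uint64_t end_us)
{
    assert(end_us > start_us);
    return (double)bytes * 1e6 / (double)(end_us - start_us);
}

/* Arithmetic mean of n samples. */
static double stat_mean(const double *x, size_t n)
{
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum / (double)n;
}

/* Unbiased sample variance of n samples (divides by n - 1). */
static double stat_variance(const double *x, size_t n)
{
    assert(n >= 2);
    double m = stat_mean(x, n);
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += (x[i] - m) * (x[i] - m);
    return acc / (double)(n - 1);
}
```

Storing the raw samples in an array and computing the statistics after the timed region keeps printing and other bookkeeping out of the measurement itself.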

Feel free to discuss benchmarking methodologies etc. and post interim results on the forum.

Assessment

You should be able to show some test code that uses all the functions specified in the driver interface.

You should also show some user level test code that uses the time_stamp and sleep system calls. You may find it useful to extend sosh to have a time and sleep command. The sleep implementation must use your clock driver.

You will need to present at least two graphs to your tutor. The two required graphs show bandwidth for file I/O while varying I/O request size and underlying NFS packet size (for appropriately large I/O requests). You must be able to explain why the graphs look the way they do. Your numbers must be taken using your timer driver. You can also show more (pertinent) graphs if you like.

NOTE: The achieved bandwidth on the graph will not be compared group-to-group. However, clearly better performance shows a better filesystem implementation. Your tutor is interested in your ability to explain what & how you measured.


Last modified: 05 Jul 2010.