Advanced Operating Systems
COMP9242 2002/S2

Critique of Microkernel Architectures

I'm not interested in making devices look like user-level. They aren't, they shouldn't, and microkernels are just stupid.

Linus Torvalds

Is Linus right?

Microkernel Performance

First-generation $\mu$-kernel systems exhibited poor performance compared with monolithic UNIX implementations.
- particularly Mach, the best-known example
The reasons were investigated by [Chen & Bershad 93]:
- instrumented user and system code to collect execution traces
- ran on a DECstation 5000/200 (25MHz R3000)
- ran under Ultrix and under Mach with a Unix server
- fed the traces to a memory system simulator
- analysed MCPI (memory cycles per instruction)
  - baseline MCPI (i.e. excluding idle loops)
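
As a rough illustration of the metric: MCPI is the number of cycles the CPU spends stalled on the memory system, divided by the number of instructions executed. The sketch below assumes a simple model in which every stall event has a fixed penalty. The function name `mcpi`, the event names, the penalties and the counts are all hypothetical and are not taken from [Chen & Bershad 93].

```python
def mcpi(instructions, events, penalties):
    """Memory cycles per instruction: stall cycles spent in the memory
    system divided by the number of instructions executed."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    stall_cycles = sum(count * penalties[kind] for kind, count in events.items())
    return stall_cycles / instructions


# Hypothetical per-event stall penalties in cycles (assumed, not measured).
PENALTIES = {"icache_miss": 15, "dcache_miss": 15, "write_stall": 5}

# Hypothetical event counts from a trace with idle-loop instructions excluded.
EVENTS = {"icache_miss": 20_000, "dcache_miss": 10_000, "write_stall": 30_000}

example = mcpi(1_000_000, EVENTS, PENALTIES)
print(f"MCPI = {example:.2f}")
```

Excluding idle-loop instructions matters because they inflate the instruction count without touching the memory system in any interesting way, which would make the MCPI look better than it is.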

Interpretation

Observations:

Mach's memory penalty (i.e. cache misses or write stalls) is higher.
Mach's VM system executes more instructions than Ultrix's
(but has more functionality).

Claim:

Degraded performance is (intrinsic?) result of OS structure.
IPC cost (known to be high in Mach) is not a major factor [Ber92].

Assertions

1

OS has less instruction and data locality than user code.

System code has higher cache and TLB miss rates.
Particularly bad for instructions.

2

System execution is more dependent on instruction cache behaviour than is user execution.

MCPIs dominated by system i-cache misses.

Note: most benchmarks were small, i.e. user code fits in cache.

3

Competition between user and system code is not a problem.

Few conflicts between user and system caching.
TLB misses are not a relevant factor.

Note: the hardware used has direct-mapped physical caches.
==> Split system/user caches wouldn't help.

Only examine system cache misses.

Shaded: System cache misses removed by associativity.

MCPI for system-only, using R3000 direct-mapped cache.

Reductions due to associativity were obtained by running system on a simulator and using a two-way associative cache of the same size.

Assertions...

4

Self-interference is a problem in system instruction reference streams.

High internal conflicts in system code.
System would benefit from higher cache associativity.
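
The effect of associativity on self-interference can be sketched with a toy cache model. This is not the simulator used in the study; it is a minimal set-associative cache with LRU replacement, where `ways=1` gives a direct-mapped cache. The cache size, line size and the address trace are assumptions chosen to make the conflict obvious: two hypothetical routines lie exactly one cache size apart and are called alternately.

```python
from collections import OrderedDict


def count_misses(addresses, cache_bytes, line_bytes, ways):
    """Count misses for a set-associative cache with LRU replacement.

    ways=1 models a direct-mapped cache.
    """
    n_sets = cache_bytes // (line_bytes * ways)
    sets = [OrderedDict() for _ in range(n_sets)]
    misses = 0
    for addr in addresses:
        line = addr // line_bytes
        index = line % n_sets
        tag = line // n_sets
        entries = sets[index]
        if tag in entries:
            entries.move_to_end(tag)         # mark as most recently used
        else:
            misses += 1
            if len(entries) >= ways:
                entries.popitem(last=False)  # evict least recently used
            entries[tag] = True
    return misses


# Assumed cache geometry (hypothetical, for illustration only).
CACHE_BYTES, LINE_BYTES = 64 * 1024, 16

# Hypothetical trace: two 256-byte routines, one cache size apart,
# executed alternately 100 times (4-byte instructions).
routine_a = range(0x10000, 0x10000 + 256, 4)
routine_b = range(0x10000 + CACHE_BYTES, 0x10000 + CACHE_BYTES + 256, 4)
trace = [pc for _ in range(100) for routine in (routine_a, routine_b) for pc in routine]

direct = count_misses(trace, CACHE_BYTES, LINE_BYTES, ways=1)
two_way = count_misses(trace, CACHE_BYTES, LINE_BYTES, ways=2)
print("direct-mapped misses:", direct)
print("two-way misses:     ", two_way)
```

On this contrived trace the direct-mapped cache reloads every line of each routine on every call, while the two-way cache of the same size only takes the initial cold misses. Real system code is less extreme, but the mechanism is the same.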

5

System block memory operations are responsible for a large percentage of memory system reference costs.

Particularly true for I/O system calls.

6

Write buffers are less effective for system references.

A write buffer allows a limited number of asynchronous writes on cache misses.
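
Why a write buffer helps less for bursts of writes (such as block memory operations) can be seen in a toy model. The sketch below assumes one instruction per cycle, that every write goes through the buffer, and that memory retires one buffered write every `drain_cycles` cycles. The function name, buffer depth and drain time are hypothetical and do not describe the DECstation hardware.

```python
def write_stall_cycles(write_pattern, depth, drain_cycles):
    """Count CPU stall cycles caused by a full write buffer.

    write_pattern: iterable of booleans, one per instruction, True if the
    instruction performs a write. The buffer holds `depth` pending writes
    and memory retires one every `drain_cycles` cycles.
    """
    pending = 0      # writes currently in the buffer
    countdown = 0    # cycles until the oldest pending write retires
    stalls = 0

    def tick():
        nonlocal pending, countdown
        if pending:
            countdown -= 1
            if countdown == 0:
                pending -= 1
                countdown = drain_cycles if pending else 0

    for is_write in write_pattern:
        if is_write:
            while pending == depth:   # buffer full: the CPU waits
                stalls += 1
                tick()
            if pending == 0:
                countdown = drain_cycles
            pending += 1
        tick()                        # one cycle to execute the instruction
    return stalls


# Same number of writes (64), spread out versus issued back to back.
spread = [True] + [False] * 7
spread_stalls = write_stall_cycles(spread * 64, depth=4, drain_cycles=5)
burst_stalls = write_stall_cycles([True] * 64, depth=4, drain_cycles=5)
print("spread-out writes:", spread_stalls, "stall cycles")
print("burst of writes:  ", burst_stalls, "stall cycles")
```

With writes spread out, the buffer drains between writes and the CPU never waits. In the burst, the buffer fills after a few writes and the CPU then runs at the speed of memory.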

7

Virtual-to-physical mapping strategy can have significant impact on cache performance.

Unfortunate mapping may increase conflict misses.
"Random" mappings (Mach) are to be avoided.
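
A small sketch of why the mapping matters for a physically indexed, direct-mapped cache: page frames whose numbers are equal modulo (cache size / page size) land in the same region of the cache (often called the same "colour") and compete for the same lines. The page size, cache size and frame numbers below are assumptions for illustration; the helper names are hypothetical.

```python
from collections import Counter

PAGE_BYTES = 4096
CACHE_BYTES = 64 * 1024          # assumed direct-mapped, physically indexed
N_COLOURS = CACHE_BYTES // PAGE_BYTES


def colour(frame_number):
    """Cache region ("colour") that a physical page frame maps to."""
    return frame_number % N_COLOURS


def worst_overlap(frames):
    """Largest number of pages in the mapping that share one colour."""
    return max(Counter(colour(f) for f in frames).values())


# Eight consecutive virtual pages, two hypothetical frame assignments.
careful = [32, 33, 34, 35, 36, 37, 38, 39]   # every page gets its own colour
unlucky = [5, 21, 37, 53, 8, 24, 40, 9]      # several frames share a colour

print("careful mapping, worst overlap:", worst_overlap(careful))
print("unlucky mapping, worst overlap:", worst_overlap(unlucky))
```

An overlap of 1 means no two of the pages can evict each other's lines. A mapping chosen without regard to colour can put several hot pages on the same cache region even though most of the cache is unused.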

Other experience with $\mu$-kernel performance

System call costs are (inherently?) high.
- Typically hundreds of cycles, 900 for Mach/i486.
Context (address-space) switching costs are (inherently?) high.
- Getting worse (in terms of cycles) with increasing CPU/memory speed ratios [Ous90].
- IPC (involving system calls and context switches) is inherently expensive.
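
To put these cycle counts in perspective, the arithmetic below converts cycles to time and shows why a fixed memory latency costs more cycles as clock rates rise. The 900-cycle figure is the one quoted above; the clock rates and the 100 ns memory latency are assumptions chosen only to make the arithmetic concrete.

```python
def cycles_to_us(cycles, clock_mhz):
    """Convert a cycle count to microseconds at the given clock rate."""
    return cycles / clock_mhz


def latency_in_cycles(latency_ns, clock_mhz):
    """Number of CPU cycles that elapse during a fixed memory latency."""
    return latency_ns * clock_mhz / 1000


SYSCALL_CYCLES = 900       # Mach/i486 figure quoted above
ASSUMED_CLOCK_MHZ = 50     # assumption: the clock rate is not stated above
ASSUMED_LATENCY_NS = 100   # assumption: hypothetical memory access time

syscall_us = cycles_to_us(SYSCALL_CYCLES, ASSUMED_CLOCK_MHZ)
print(f"{syscall_us:.1f} us per system call at {ASSUMED_CLOCK_MHZ} MHz")

for mhz in (25, 50, 500):
    cycles = latency_in_cycles(ASSUMED_LATENCY_NS, mhz)
    print(f"{ASSUMED_LATENCY_NS} ns of memory latency = {cycles:.1f} cycles at {mhz} MHz")
```

The second loop is the point of the [Ous90] observation: operations dominated by memory accesses, such as context switches, do not speed up in proportion to the CPU clock, so their cost measured in cycles grows.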

So, what's wrong?

$\mu$-kernels depend heavily on IPC.
IPC is expensive.
- Is the $\mu$-kernel idea flawed?
- Should some code never leave the kernel?
- Do we have to buy flexibility with performance?


Gernot Heiser 2002-08-28