

School of Computer Science & Engineering

**COMP9242 Advanced Operating Systems** 

2019 T2 Week 09b Local OS Research @GernotHeiser



# **Copyright Notice**

# These slides are distributed under the Creative Commons Attribution 3.0 License

- You are free:
  - to share—to copy, distribute and transmit the work
  - to remix-to adapt the work
- under the following conditions:
  - Attribution: You must attribute the work (but not in any way that suggests that the author endorses you or your use of the work) as follows:

"Courtesy of Gernot Heiser, UNSW Sydney"

The complete license text can be found at http://creativecommons.org/licenses/by/3.0/legalcode



# Quantifying Security Impact of Operating-System Design



# Quantifying OS-Design Security Impact

#### Approach:

- Examine all *critical* Linux CVEs (vulnerabilities & exploits database)
  - easy to exploit
  - high impact
  - no defence available
  - confirmed



• For each establish how microkernel-based design would change impact







# **Sel4** Hypothetical Security-Critical App



App requires:

- IP networking
- File storage
- Display output

5 COMP9242 2019T2 W09b: Local OS Research























# All Critical Linux CVEs to 2017





#### Summary

#### **OS structure matters!**

- Microkernels definitely improve security
- Monolithic OS design is *fundamentally flawed from security point of view*

[Biggs et al., APSys'18]

Use of a monolithic OS in security- or safetycritical scenarios is professional malpractice!





# Cogent



#### Beyond the Kernel



#### Aim: Verified TCB at affordable cost!



### Cogent: Code & Proof Co-Generation

Aim: Reduce cost of verified systems code

- Restricted, purely functional systems language
- Type- and memory safe, not managed
- Turing incomplete
- File system case-studies: BilbyFs, ext2, F2FS, VFAT

[O'Connor et al, ICFP'16; Amani et al, ASPLOS'16]





#### Manual Proof Effort

| BilbyFS<br>functions          | Effort  | lsabelle<br>LoP | Cogent<br>SLoC | Cost<br>\$/SLoC | LoP/<br>SLOC |  |
|-------------------------------|---------|-----------------|----------------|-----------------|--------------|--|
| isync()/<br>iget()<br>library | 9.25 pm | 13,000          | 1,350          | 150             | 10           |  |
| sync()-<br>specific           | 3.75 pm | 5,700           | 300            | 260             | 19           |  |
| iget()-<br>specific           | 1 pm    | 1,800           | 200            | 100             | 9            |  |
| seL4                          | 12 py   | 180,000         | 8,700 C        | 350             | 20           |  |
|                               |         |                 |                |                 |              |  |

BilbyFS: 4,200 LoC Cogent



#### Addressing Verification Cost



#### **Dependability-cost tradeoff:**

- Reduced faults through safe language
- Property-based testing (QuickCheck)
- Model checking

•

Full functional correctness proof

Spec reuse!

#### Work in progress:

- Language expressiveness
- Reduce boiler-plate code
- Network stacks
- Device drivers



# **Time Protection**



### Refresh: Microarchitectural Timing Channels



Contention for shared hardware resources affects execution speed, leading to timing channels



#### OS Must Enforce Time Protection



#### **Preventing interference is core duty of the OS!**

- *Memory protection* is well established
- *Time protection* is completely absent



#### Time Protection: No Sharing of HW State







22 | COMP9242 2019T2 W09b: Local OS Research



## Sel4 Spatial Partitioning: Cache Colouring





# Sel4 Channel Through Kernel Code



Channel matrix: Conditional probability of observing output signal (time) given input signal (system-call number)







### Sel4 Spatial Partitioning: Cache Colouring





# Sel4 Channel Through Kernel Code





# Sel4 Temporal Partitioning: Flush on Switch

Must remove any history dependence!

- 2. Switch user context
- 3. Flush on-core state

- 6. Reprogram timer
- 7. return















# Sel4 Temporal Partitioning: Flush on Switch

Must remove any history dependence!











# Sel4 Performance Impact of Colouring

Splash-2 benchmarks on Arm A9





### A New HW/SW Contract

For all shared microarchitectural resources:

- 1. Resource must be spatially partitionable or flushable
- 2. Concurrently shared resources must be spatially partitioned
- 4. Mechanisms must be sufficiently specified for OS to partition or reset
- 5. Mechanisms must be constant time, or of specified, bounded latency
- 6. Desirable: OS should know if resettable state is derived from data, instructions, data addresses or instruction addresses

[Ge et al., APSys'18]

aISA: augmented ISA

Cannot share HW threads

across security domains!



## **Sel4** Can Time Protection Be Verified?

- 1. Correct treatment of spatially partitioned state:
  - Need hardware model that identifies all such state (augmented ISA)
  - To prove:
    No two domains can access the same physical state

Functional property!

- 2. Correct flushing of time-shared state
  - Not trivial: eg proving all cleanup code/data are forced into cache after flush
    - Needs an actual cache model
  - Even trickier: need to prove padding is correct
    - ... without explicitly reasoning about time!

Transforms timing channels into storage channels!

Functional property!





# **Sel4** Verifying Time Padding

- Idea: Minimal formalisation of hardware clocks (abstract time)
  - Monotonically-increasing counter
  - Can add constants to time values
  - Can compare time values





# Making COTS Hardware Dependable



#### Satellites: SWaP vs Dependability

Space is becoming commodisized:

- many, small (micro-) satellites
- increasing cost pressure

#### Harsh evironment for electronics:

- temperature fluctuations
- ionising radiation



NCUBE2 by Bjørn Pedersen, NTNU (CC BY 1.0)

Radiation-hardened processors are slow, bulky and expensive

Use redundancy of cheap COTS multicores

38



#### **Traditional Redundancy Approaches**







#### **RCoE: Two Variants**

#### Loosely-coupled RCoE

- Sync on syscalls & exceptions
- Preemptions in usermode not further synchronised (imprecise)
- Low overhead
- Cannot support racy apps, threads, virtual machines

#### **Closely-coupled RCoE**

- Sync on instruction
- Precise preemptions
- High overhead
- Supports all apps
- May need re-compile



## Sel4 Closely-Coupled RCoE Implementation











### Sel4 Performance: SPLASH-2 on x86 VMs

| Name      | Ν    | Base | CC-D | Factor |
|-----------|------|------|------|--------|
| BARNES    | 30   | 61   | 93   | 1.52   |
| CHOLESKY  | 300  | 66   | 792  | 12.08  |
| FFT       | 100  | 64   | 142  | 2.22   |
| FFM       | 20   | 76   | 160  | 2.11   |
| LU-C      | 30   | 64   | 437  | 6.83   |
| LU-NC     | 20   | 62   | 381  | 6.12   |
| OCEAN-C   | 1000 | 64   | 173  | 2.71   |
| OCEAN-NC  | 1000 | 65   | 171  | 2.65   |
| RADIOSITY | 25   | 66   | 75   | 1.12   |
| RADIX     | 20   | 66   | 89   | 1.34   |
| RAYTRACE  | 1000 | 60   | 65   | 1.09   |
| VOLREND   | 100  | 86   | 133  | 1.54   |
| WATER-NS  | 600  | 66   | 92   | 1.41   |
| WATER-S   | 600  | 67   | 84   | 1.25   |



Geometric mean overhead: 2.3×









# Sel4 Performance: Redis on Arm













|               |                   | S                                                      | abre Lite  |    | RAD750                |         |            |
|---------------|-------------------|--------------------------------------------------------|------------|----|-----------------------|---------|------------|
|               | Cores @ clock     | 4 @ 800 MHz<br>4 × 2,000 DMIPS<br>< 5 W<br>200 DMIPS/W |            |    | 1 @ 133 MHz           |         |            |
|               | Performance       |                                                        |            |    | 240 DMIPS             |         |            |
|               | Power             |                                                        |            |    | < 6 W                 | 2002 pr | 2002 price |
| <             | Energy Efficiency |                                                        |            |    | 40 DMIPS/W            |         |            |
|               | Cost              | /                                                      | \$200      |    | \$200,000             |         |            |
| <             | Perf/Cost         |                                                        | 5 DMIPS/\$ | 0. | 0002 DMIPS/\$         | >       |            |
| Assuming 2×   |                   |                                                        |            |    |                       |         |            |
| overhead, TMR |                   |                                                        |            |    | [Shen et al., DSN'19] |         |            |



# **Real-World Use**

49 COMP9242 2019T2 W09b: Local OS Research

© Gernot Heiser 2019 – CC Attribution License













## sel4 Incremental Cyber Retrofit









# **Sel4** Incremental Cyber Retrofit





# Sel4 Issue: Capabilities are Low-Level





### **Sel4** Simple But Non-Trivial System





### **Sel4** Component Middleware: CAmkES

















