COMP9242 2017/S2

UNSW
CRICOS Provider
Number: 00098G

seL4 Debugging Guide

When developing an operating system on top of seL4 you do not have the luxury of using a debugger such as gdb. Your best bet is a combination of dprintf and objdump. If you are doing the project on a Linux machine (using our linux toolchain) you can also use the backtrace function, combined with a script for parsing objdump output to obtain a primitive stack trace.

Debugging Faults

All exceptions result in a context switch to the seL4 kernel. The kernel exports fault handling policies to user level by sending an IPC to the appropriate IPC endpoint (see Chapter 5 of the seL4 manual). For applications such as tty_test, the assigned endpoint will usually be the IPC endpoint that SOS waits on in the syscall_loop. SOS on the other hand is initialised with no fault endpoint and hence is expected to ensure that it does not cause a fault.

SOS faults

Below is an example of a page fault that is caused by SOS:

388 int main(void) {           
389     
390     dprintf(0, "\nSOS Starting...\n");
391     *((char*)0xDEADBEEF) = 0;

SOS Starting...                                                                 
Caught cap fault in send phase at address 0x0                                   
while trying to handle:                                                         
vm fault on data at address 0xdeadbeef with status 0xf5                         
in thread 0xf0063e00 at address 0x8ae4

A capability fault occurs while sending the fault IPC to the fault handler because the capability address of SOS's fault endpoint is 0x0 (seL4_CapNull). 0x0 does not address a valid synchronous endpoint in the capability space of the faulting thread (SOS). From the provided information, we also see that SOS caused a vm fault. The fault address register (DFAR) was set to 0xdeadbeef while the fault status register (FSR) was set to 0xf5 (see Sections B4.1.51 and B4.1.52 of the ARMv7 manual). Lastly, and most importantly, we see the value of the instruction pointer at the time of the fault is 0x8ae4. objdump may then be used to determine the context of fault.

Application faults

When an application thread is created, an appropriate fault handler IPC endpoint should be set in the thread's TCB (see Chapter 5 of the seL4 manual). If the application causes a fault and IPC will be sent to the assigned fault endpoint. Message registers provide important information regarding the cause and location of the fault. objdump may then be used to determine the context of fault.

Using `objdump`

Whenever you cause a fault, you should be able to see the current program counter printed out. You can use objdump to find where the offending code is.

% arm-none-linux-gnueabi-objdump -Dlx stage/arm/imx6/bin/sos  | less

You can now use the searching facility in less to search for the faulting address. In this case I find the following fragment of output:

/tmp_amd/ravel/export/ravel/2/akroh/aos-2012/apps/sos/src/main.c:391
    8adc:       e3a02000        mov     r2, #0
    8ae0:       e59f367c        ldr     r3, [pc, #1660] ; 9164 
    8ae4:       e5432010        strb    r2, [r3, #-16]

We can see that the error has occurred on code at line 391 of main.c. If the file name and line number does not appear in your dump, run make menuconfig, select Emit debugging information in the Toolchain Options menu and recompile.

If the fault occurs in an application (such as tty_test or sosh) rather than SOS, the following command should be used:

% arm-none-linux-gnueabi-objdump -Dlx stage/arm/imx6/bin/appname  | less

More on `objdump`

objdump is a very handy utility for working out exactly what is where in an executable so you can work out what exactly is going wrong.

The two standard incantations for objdump are:

% arm-none-linux-gnueabi-objdump -dl my_elf.file | less

and

% arm-none-linux-gnueabi-objdump -lx my_elf.file | less

The first command (-dl) disassembles the text segment and shows you all the instructions and at what address. Using this information you can find out things such as:

What does my function compile to?
eg. Am I compiling the right function? Do the instructions make sense?
Why is this instruction crashing?
Look at the fsr for the type of fault, and use objdump to find the exact instruction causing the fault.

NB: In case it's not yet obvious, you will need to get up to speed on your ARM assembly and not be afraid to get your hands dirty if you want to minimise the time you spend debugging.

The second objdump command (-lx) is useful for when addresses appear inside an object file but outside of the text segment. This is especially useful when debugging ELF loading. The -lx option displays section and symbol information. Further options can be added to dump data segments etc. man objdump is your friend.

Generating a Stack Trace

Warning: Backtrace only supported for the Linux toolchain

The backtrace() function is called by default by assert() and conditional_panic(). You can also call it from within any arbitrary place in your code. Backtrace will traverse the stack and print out the addresses of each function in the call stack (as long as you haven't done anything to the stack). This can be useful if you need to know what your code is doing.

Backtrace output looks like this:

Assertion failed: 0, function _sos_init, file /home/qiang/aos-2012/apps/sos/src/main.c, line 385.
seL4 root server aborted
Backtracing stack PCs:  0xb06c  0xbe88  0x9124  <------------------------------the back trace
Debug halt syscall from user thread 0xf0063e00
halting...

We have also provided a script to parse the objdump output and lookup the addresses in the stack trace and show the assembler and C code that they correspond to. Copy the script from ~cs9242/public_html/17/project/files/backtrace.sh into the root directory of your project and execute as follows:

USAGE:    backtrace.sh [OPTIONS] <<EOF
           <pc_addr0> <pc_addr1> <pc_addr2> <pc_addr3> ...
           <<EOF

OPTIONS:  -l printout line number for disassembly
          -S printout source code for disassembly
          -w WIDTH extra lines of disassembly for each side of the target PC, between 0 and 32 (default is 4)
          --start-address=0xADDR only process data whose address is >= ADDR
          --stop-address=0xADDR  only process data whose address is <= ADDR
          --help printout this message

Format:   The PCs(Program Counter Values) should be space or newline separated
          hex numbers, starting with "0x"

Finally, here is an example usage of the backtrace script:

./backtrace.sh -w 1
0xbdd0  0xe418  0x9e50  0x9ec8  0xa10c  0xa8b4  0x9488  
<EOF> (On Linux, this is done by pressing Ctrl-D)

This should output something like:

--------------------------------------------------------------------------------
    # 0  0x0000bdd0 in  abort                         

    bdcc:   ebffffb0    bl  bc94 <backtrace>
    bdd0:   e2506000    subs    r6, r0, #0
    bdd4:   0a000010    beq be1c <abort+0x98>
--------------------------------------------------------------------------------
    # 1  0x0000e418 in calloc                        

    e414:   ebfff65a    bl  bd84 <abort>

0000e418 <calloc>:
    e418:   e92d4038    push    {r3, r4, r5, lr}
    e41c:   e0050190    mul r5, r0, r1
--------------------------------------------------------------------------------
    # 2  0x00009e50 in  epit_enable                   

    9e4c:   eb001162    bl  e3dc <__assert>
    9e50:   e28dd00c    add sp, sp, #12
    9e54:   e8bd8000    ldmfd   sp!, {pc}
--------------------------------------------------------------------------------
    # 3  0x00009ec8 in  epit_setup_timer              

    9ec4:   ebffffc7    bl  9de8 <epit_enable>
    9ec8:   e3040538    movw    r0, #17720  ; 0x4538
    9ecc:   e3400002    movt    r0, #2
--------------------------------------------------------------------------------
    # 4  0x0000a10c in  epit_init                     

    a108:   ebffff69    bl  9eb4 <epit_setup_timer>
    a10c:   e30405f0    movw    r0, #17904  ; 0x45f0
    a110:   e3400002    movt    r0, #2
--------------------------------------------------------------------------------
    # 5  0x0000a8b4 in  start_timer                   

    a8b0:   ebfffdf9    bl  a09c <epit_init>
    a8b4:   e2850028    add r0, r5, #40 ; 0x28
    a8b8:   e1a01004    mov r1, r4
--------------------------------------------------------------------------------
    # 6  0x00009488 in  main                          

    9484:   eb0004ff    bl  a888 <start_timer>
    9488:   e30805e0    movw    r0, #34272  ; 0x85e0
    948c:   e3400003    movt    r0, #3

Last modified: 02 Aug 2017.