GDB - Core Dumps

Outcome

Able to enable core dumps and debug them using GDB.

Introduction

When a program crashes, its memory and some state information at that point can be written to a core dump file. A common cause of a core dump is a segmentation fault, which occurs when a program attempts to access an illegal memory location. Typical culprits include use after free, buffer overflows, and dereferencing a NULL pointer. GDB can read the core dump file and show useful information about the state of the program when it crashed.

Dumping core is also very useful in situations where faults occur intermittently. It allows you to inspect what might have happened even in situations where the fault is difficult to trigger.

Applicable subjects

COMP1521, COMP2521


Core Dump Settings

To enable core dumps, first check the maximum core dump size:

$ ulimit -c

If the result of this is zero (i.e. no core dump will be produced), set the limit to the maximum:

$ ulimit -c unlimited

A core dump will now be generated and placed in the location specified by /proc/sys/kernel/core_pattern. Check this location by running:

$ cat /proc/sys/kernel/core_pattern

On CSE systems (and many other systems), the default settings result in the output:

core

This means that any core dumps will be placed in the current directory in a file named core.

You can change this location (this requires root privileges) using:

$ echo "<desired-file-path>/<desired-file-name>" > /proc/sys/kernel/core_pattern

Generating a Core Dump

Compile the code for use with GDB.

$ gcc -g <any other flags> -o <file_name> <file_name>.c

Run the program as normal.

$ ./<file_name>
Segmentation fault (core dumped)

An error message like the one above should appear if the program crashes.

Starting a GDB session

Start a GDB session with the program binary and the core dump file.

$ gdb <binary-file> <core-dump-file>

GDB is useful for inspecting the stack frames and the state of variables and registers at the moment the program crashed. Commands such as where, up, down, print, info locals, info args, info registers and list are helpful in this situation.

It is useful to remember that, while debugging core dumps, the program is not actually running, so commands related to the execution of the program such as step, next and continue are unavailable.

Core Dumps and WSL

Core dumps are currently unavailable on WSL (see the GitHub issue for more details). An alternative is to run the program inside GDB and let it crash there, which gives access to similar commands such as where and info locals.


Example

In this example, we will debug a program that creates a linked list and then prints it out. During execution, however, the program triggers a segmentation fault. We will inspect the corresponding core dump to determine the source of the bug.

broken_linked_list.c

//Makes a linked list of length 7 and prints it out
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

struct node {
    int data;
    struct node *next;
};

struct node *create_node(int data);
struct node *create_list(int length);
void print_list(struct node *list, int length);

int main(void){
    int length1 = 7;
    struct node *list1 = create_list(length1);
    print_list(list1, length1);

    return 0;
}

struct node *create_node(int data){
    struct node *new = malloc(sizeof(struct node));
    assert(new != NULL);
    new->data = data;
    new->next = NULL;
    return new;
}

struct node *create_list(int length) {

    struct node *head = NULL;
    if (length > 0) {
        head = create_node(0);
        int i = 1;
        struct node *curr = head;
        while (i < length) {
            curr->next = create_node(i);
            curr = curr->next;
            i++;
        }
    }
    return head;
}

void print_list(struct node *list, int length){
    struct node *curr = list;
    int i = 0;
    while (i <= length) {
        printf("%d->", curr->data);
        curr = curr->next;
        i++;
    }
    printf("X\n");
}

Note

It is assumed that you have the knowledge introduced in the Basic Use, Breakpoints, Viewing Data and Navigating Your Program modules.

When the program above is compiled and run, the following output is produced:

$ gcc -g -o broken_linked_list broken_linked_list.c
$ ./broken_linked_list
Segmentation fault (core dumped)

This output means that the program crashed because it accessed a part of memory that it is not allowed to access.

First, we want to find the line that it crashed on. There should now be a file called core inside the current directory (if not, see the Core Dump Settings section).

Start a GDB session for the core dump.

$ gdb broken_linked_list core

Immediately, GDB will output the line it crashed on.

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000055be9593e283 in print_list (list=0x55be96c20260, length=7)
    at broken_linked_list.c:51
51          printf("%d->", curr->data);

We now know from this output that invalid memory was accessed on line 51, so we look at the memory accessed on that line. There is one memory access on this line, curr->data, so either curr is not a pointer we are allowed to dereference or the data field it points to is not readable. Let’s print out the current pointer.

(gdb) print curr
$1 = (struct node *) 0x0

We know that we are not allowed to dereference the NULL (zero) pointer so we have found why our program has segfaulted. However, we are not sure about why it is dereferencing a NULL pointer. Let’s look at the local variables and see if they hold any clues.

(gdb) info locals
curr = 0x0
i = 7

When the program crashed, i was 7, which means it was on the 8th iteration of the loop. Our linked list is only 7 nodes long, so the loop should never reach an ‘8th node’. If the linked list was constructed correctly, the next pointer of the 7th node is NULL, which is exactly the value curr now holds.

Let’s check some other variables, such as the arguments passed into the function.

(gdb) info args
list = 0x55be96c20260
length = 7

Our linked list is indeed 7 nodes long, and we can check that list is a valid pointer by printing the dereferenced struct.

(gdb) print *list
$2 = {data = 0, next = 0x55be96c20280}

We know the arguments are correct, so the issue must be inside the function.

We can use list to look at the code around the current line.

(gdb) list
46
47  void print_list(struct node *list, int length){
48      struct node *curr = list;
49      int i = 0;
50      while (i <= length) {
51          printf("%d->", curr->data);
52          curr = curr->next;
53          i++;
54      }
55      printf("X\n");

Off-by-one errors are common: they make a loop run one more or one fewer time than intended. The condition on line 50 keeps the loop running while i <= length, so the loop only stops once i reaches 8. We want it to stop when i reaches 7, so this is most likely the cause of the crash.

Looking at the code, we may realise that, not only is there an off by one error, but there is a better way to traverse a linked list to its end. This is achieved by ending the loop when a NULL is reached. This adds some protection against an incorrect length passed in.

With print_list rewritten to stop at NULL, the program no longer segfaults:

$ ./linked_list
0->1->2->3->4->5->6->X

linked_list.c
//Makes a linked list of length 7 and prints it out
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>


struct node {
    int data;
    struct node *next;
};

struct node *create_node(int data);
struct node *create_list(int length);
void print_list(struct node *list);

int main(void){
    struct node *list1 = create_list(7);
    print_list(list1);

    return 0;
}

struct node *create_node(int data){
    struct node *new = malloc(sizeof(struct node));
    assert(new != NULL);
    new->data = data;
    new->next = NULL;
    return new;
}

struct node *create_list(int length) {

    struct node *head = NULL;
    if (length > 0) {
        head = create_node(0);
        int i = 1;
        struct node *curr = head;
        while (i < length) {
            curr->next = create_node(i);
            curr = curr->next;
            i++;
        }
    }
    return head;
}

void print_list(struct node *list){
    struct node *curr = list;

    while (curr != NULL) {
        printf("%d->", curr->data);
        curr = curr->next;
    }
    printf("X\n");
}

Module author: Liz Willer <e.willer@unsw.edu.au>

Date

2020-01-15