.. _gdb_coredumps: **************** GDB - Core Dumps **************** .. topic:: Outcome Able to enable core dumps and debug them using GDB. .. topic:: Introduction When a program crashes, the memory and some state information at that point can be placed in a core dump file. A common cause of a core dump generation is a segmentation fault, which is caused by attempting to access an illegal memory location. This can include use after free, buffer overflow, and dereferencing the NULL pointer. GDB can be used to read the core dump file and view some useful information about the program when it crashed. Dumping core is also very useful in situations where faults occur intermittently. It allows you to inspect what might have happened even in situations where the fault is difficult to trigger. .. topic:: Applicable subjects COMP1521, COMP2521 ---- Core Dump Settings ====================== To enable core dumps, first check the maximum core dump size: :: $ ulimit -c If the result of this is zero (i.e. no core dump will be produced), set the limit to the maximum: :: $ ulimit -c unlimited A core dump will now be generated and placed in the location specified by /proc/sys/kernel/core_pattern. Check this location by running: :: $ cat /proc/sys/kernel/core_pattern On CSE systems (and many other systems), the default settings result in the output: :: core This means that any core dumps will be placed in the current directory in a file named **core**. You can change this location using: :: $ echo "/" > /proc/sys/kernel/core_pattern Generating a Core Dump ======================== Compile the code for use with GDB. :: $ gcc -g -o file_name file_name.c Run the program as normal :: $ ./ Segmentation fault (core dumped) An error message like the one above should appear if the program crashes. Starting a GDB session ======================= Start a GDB session with the program binary and coredump file :: $ gdb GDB is helpful to inspect the stack frame and the state of variables and registers when the program crashed. Commands such as *where*, *up*, *down*, *print*, *info locals*, *info args*, *info registers* and *list* can be helpful in this situation. It is useful to remember that, while debugging core dumps, the program is not actually running, so commands related to the execution of the program such as *step*, *next* and *continue* are unavailable. Coredumps and WSL ================= Core dumps are currently *unavailable* on WSL (see the `github issue `_ for more details). An alternative option is to run the program in gdb and have it crash (which provides access to similar commands such as where and info locals). ---- Example ======== In this example, we will be debugging a code that creates a linked list then prints it out. During the execution of the code, however, a segmentation fault is generated. We will inspect the corresponding core dump to determine the source of the bug. :download:`broken_linked_list.c` .. literalinclude:: broken_linked_list.c :language: c :linenos: :caption: broken_linked_list.c .. note:: It is assumed that you have the knowledge introduced in the Basic Use, Breakoints, Viewing Data and Navigating Your Program modules. When the program above is compiled and run, the following output is produced: :: $ gcc -g -o broken_linked_list broken_linked_list.c $ ./broken_linked_list Segmentation fault (core dumped) This ouput means that the program crashed because it accessed a part of memory that it is not allowed to. First, we want to find the line that it crashed on. There should now be a file called *core* inside the current directory (if not, see the Core Dump Settings section). Start a GDB session for the core dump. :: $ gdb broken_linked_list core Immediately, GDB will output the line it crashed on. :: Program terminated with signal SIGSEGV, Segmentation fault. #0 0x000055be9593e283 in print_list (list=0x55be96c20260, length=7) at broken_linked_list.c:51 51 printf("%d->", curr->data); We now know from this output that invalid memory was accessed on line 51, so we look at the memory that is accessed on that line. There is one memory access on this line curr->data, so we are either not allowed to dereference curr or we are not allowed to read data. Let's print out the current pointer. :: (gdb) print curr $1 = (struct node *) 0x0 We know that we are not allowed to dereference the NULL (zero) pointer so we have found why our program has segfaulted. However, we are not sure about why it is dereferencing a NULL pointer. Let's look at the local variables and see if they hold any clues. :: (gdb) info locals curr = 0x0 i = 7 When the program crashed, i is 7, which means it is on the 8th iteration of the loop. Our linked list is only 7 nodes long so it should never reach 'node 8'. If we have constructed our linked list correctly the '8th node' is a NULL pointer. Let's check out some variables, such as the arguments passed into the fuctions. :: (gdb) info args list = 0x55be96c20260 length = 7 Our linked list is indeed 7 nodes long, and we can check that list is a valid pointer by printing the dereferenced struct. :: (gdb) print *list $2 = {data = 0, next = 0x55be96c20280} We know the arguments are correct, so the issue must be inside the function. We can use *list* to look at the code around the current line. :: (gdb) list 46 47 void print_list(struct node *list, int length){ 48 struct node *curr = list; 49 int i = 0; 50 while (i <= length) { 51 printf("%d->", curr->data); 52 curr = curr->next; 53 i++; 54 } 55 printf("X\n"); An off by one error is common and would cause the while loop to go for one more or one less loop than desired. Line 50 stops the loop when i is greater than length (i.e. when i = 8). We want to exit the loop when i = 7, so this is most likely causing our issues. Looking at the code, we may realise that, not only is there an off by one error, but there is a better way to traverse a linked list to its end. This is achieved by ending the loop when a NULL is reached. This adds some protection against an incorrect length passed in. We fix this code with the new function and no more segfault! :: $ ./linked_list.c 0->1->2->3->4->5->6->X .. literalinclude:: linked_list.c :language: c :linenos: :caption: linked_list.c .. moduleauthor:: Liz Willer :Date: 2020-01-15