Week 03



Things to Note ...

In This Lecture ...

Coming Up ...


Nerdy Things You Should Know2/67

Consider the following scenario ...

Fear not! This is ... How to speak   #@*%$!   Ascii

From
blog.codinghorror.com/ascii-pronunciation-rules-for-programmers/


... Nerdy Things You Should Know3/67

Symbol Common Name Silliest Name
&
*
"
^
@
!
#
;


Software Development Process4/67

Reminder of how software development runs ...

Typically, iterate over the implementation/testing phases.


... Software Development Process5/67

Tools available to assist each stage ...

For testing, assert and (even) printf are also useful.
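A minimal sketch of both tools inside one function. The function max3 is a hypothetical example, not part of the course code, and the diagnostic goes to stderr rather than stdout so it does not mix with real output.

```c
#include <assert.h>
#include <stdio.h>

// Hypothetical function under test: largest of three ints.
int max3(int a, int b, int c) {
    int m = a;
    if (b > m) m = b;
    if (c > m) m = c;
    // Diagnostic output: shows the inputs and the computed result.
    fprintf(stderr, "max3(%d,%d,%d) -> %d\n", a, b, c, m);
    // Postcondition: the result is no smaller than any argument.
    assert(m >= a && m >= b && m >= c);
    return m;
}
```

If the postcondition is ever false, assert aborts the program and reports the file and line of the failed check.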


Testing


Testing7/67

A systematic process of determining whether a program

Testing requires:


... Testing8/67

Testing happens at different stages in development:

A useful approach for unit testing:


... Testing9/67

Testing alone cannot establish that a program is correct.

Why not?

To show that a program is correct, we need to show

Problem: even small programs have too many possible inputs.

Testing increases our confidence that the program is ok.

Well-chosen tests significantly increase our confidence.


Exercise: Testing a function10/67

Consider the notion that an array int A[N] is ordered:

ordered(A[N]):  forall i:1..N-1,  A[i] >= A[i-1]

and some code to implement this test

ordered = 1; // assume that it *is* ordered
for (i = 1; i < N; i++) {
    if (A[i] >= A[i-1])
        /* ok ... still ordered */;
    else
        ordered = 0; // found counter-example
}
// ordered is set appropriately

Put this code in a function and write a driver to test it.

Print "Yes" if ordered, or print "No" otherwise.
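One possible shape for the function and its driver, as a sketch. The names report and test_ordered are assumptions, and this version returns as soon as it finds a counter-example instead of setting a flag and continuing.

```c
#include <stdio.h>

// Returns 1 if A[0..N-1] is in non-decreasing order, 0 otherwise.
// Arrays with fewer than two elements are trivially ordered.
int ordered(int A[], int N) {
    int i;
    for (i = 1; i < N; i++) {
        if (A[i] < A[i-1])
            return 0; // found counter-example
    }
    return 1;
}

// Prints "Yes" if the array is ordered, "No" otherwise.
void report(int A[], int N) {
    printf("%s\n", ordered(A, N) ? "Yes" : "No");
}

// Driver body: one call per kind of input worth checking.
void test_ordered(void) {
    int up[] = {1, 2, 3, 4};
    int down[] = {4, 3, 2, 1};
    int same[] = {2, 2, 2};
    int one[] = {5};
    report(up, 4);   // ordered
    report(down, 4); // reverse ordered
    report(same, 3); // duplicates: equal neighbours satisfy >=
    report(one, 1);  // single element: no pairs to compare
    report(one, 0);  // empty range: no pairs to compare
}
```

A main function that calls test_ordered() completes the driver.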


Testing Strategy11/67

Testing only increases our confidence that the program works.

Unfortunately, we tend to make the big assumption:

"It works for my test data, so it'll work for all data"

The only way to be absolutely sure that this is true:

Exhaustive testing is not possible in practice ...


... Testing Strategy12/67

Realistic testing:

If, for each input, the program gives the expected output, then


Developing Test Cases13/67

Examples:

Kind of Problem   Partitions of input values for testing
Numeric           +value, -value, zero
Text              0, 1, 2, many text elements; lines of zero and huge length
List              0, 1, 2, many list items
Ordering          ordered, reverse ordered, random order, duplicates
Sorting           same as for ordering
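A sketch of how the Numeric row might turn into test cases: one representative value per partition, plus the extremes of the int range. The function sign_of is a hypothetical example chosen only to have something to test.

```c
#include <assert.h>
#include <limits.h>

// Hypothetical function under test: -1, 0 or +1 according to the sign of n.
int sign_of(int n) {
    return (n > 0) - (n < 0);
}

// One representative per partition, plus the extremes of the int range.
void test_sign_of(void) {
    assert(sign_of(42) == 1);       // +value
    assert(sign_of(-42) == -1);     // -value
    assert(sign_of(0) == 0);        // zero
    assert(sign_of(INT_MAX) == 1);  // largest +value
    assert(sign_of(INT_MIN) == -1); // smallest -value
}
```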


... Developing Test Cases14/67

Years of (painful) experience have yielded some common bugs:

Choose test cases to ensure that all such bugs will be exercised.

Requires you to understand the "limit points" of your program.


Making Programs Fail15/67

Some techniques that you can use to exercise potential bugs:


Summary: Testing Strategies16/67


Debugging


Debugging18/67

Debugging = process of removing errors from software.

Required when observed output != expected output.

Typically ...

Bug: code fragment that does not satisfy its specification.


... Debugging19/67

Consequences of bugs:


** but if the runtime error is due to pointer mismanagement, you're very unlucky


... Debugging20/67

Debugging has three aspects:

Generally ...

To fix: re-examine spec, modify code to satisfy spec.


... Debugging21/67

The easiest bugs to find:

The most difficult bugs to find:

Assumptions are what makes debugging difficult.

Corollary: an onlooker will find the bug quicker than you.


Debugging Strategies22/67

The following will assist in the task of finding bugs:


Debuggers23/67

Debuggers: tools to assist in finding bugs

Typically provide facilities to

Examples:


Exercise: Buggy Program24/67

Spot the bugs in the following simple program:

int main(int argc, char *argv[]) {
	int i, sum;
	int a[] = {7, 4, 3};

	for (i = 1; i <= 3; i++) {
		sum += a[i];
	}
	printf(sum);
	return EXIT_SUCCESS;
}

What will we observe when it's compiled/executed?


The Debugging Process


The Debugging Process26/67

Debugging requires a detailed understanding of program state.

The state of a program comprises:

Simple example, considering just local vars in a function:

Note: for any realistic program, the state will be rather large ...


... The Debugging Process27/67

The real difficulty of debugging is locating the bug.

Since a bug is "code with unintended action", you need to know:

In any non-trivial program, the sheer amount of detail is a problem.

The trick to effective debugging is narrowing the focus of attention.

That is, employ a search strategy that enables you to zoom in on the bug.


... The Debugging Process28/67

When you run a buggy program, initially everything is ok.

At some point, the buggy statement is executed ...

The goal: find the point at which the state becomes "corrupted"

Typically, you need to determine which variables "got the wrong value".


Locating the Bug29/67

A simple search strategy for debugging is as follows:


... Locating the Bug30/67

At each stage you have eliminated half of the program from suspicion.

In not too many steps, you will have identified a specific buggy statement.

The problems:

Side note: this approach won't necessarily find an existing bug ...

E.g.     x:int { y = x+x; } y==x^2,   when initially   x==2
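The same situation as a runnable sketch: the spec asks for x squared, the code doubles x, and a check at x==2 sees nothing wrong. The function names are hypothetical.

```c
// Buggy on purpose: the spec asks for x squared, the code doubles x.
int square_buggy(int x) {
    return x + x;
}

// Returns 1 if the result matches the specification y == x*x for this x.
int meets_spec(int x) {
    return square_buggy(x) == x * x;
}
```

Here meets_spec(2) is 1 because 2+2 and 2*2 are both 4, while meets_spec(3) is 0 because 6 is not 9. The state looks correct at the inspection point for x==2 even though the statement is wrong.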


... Locating the Bug31/67

A slightly smarter strategy, relying on the typical structure of programs:

How to determine strategic points? E.g.


Examining Program State32/67

A vital tool for debugging is a mechanism to display state.

One method: diagnostic printf statements of "suspect" variables.
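A minimal sketch of a diagnostic trace, assuming the suspect variables are the loop index and the running sum. The function name is hypothetical, and the output goes to stderr so it stays separate from the program's normal output.

```c
#include <stdio.h>

// Sums a[0..n-1], printing the suspect variables on every iteration.
int sum_with_trace(const int a[], int n) {
    int i, sum = 0;
    for (i = 0; i < n; i++) {
        sum += a[i];
        // One line of state per iteration: index, element, running total.
        fprintf(stderr, "DEBUG: i=%d a[i]=%d sum=%d\n", i, a[i], sum);
    }
    return sum;
}
```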

Problems with this approach:


... Examining Program State33/67

An alternative for obtaining access to program state:

This is precisely what debuggers such as gdb provide.

Debuggers also allow you to inspect the state of a program that has crashed due to a run-time error.

Often, this takes you straight to the point where the bug occurred.


C Program Execution34/67

Under Unix, a C program executes either:

Normal C execution environment:

[Diagram:Pic/unixproc-small.png]


... C Program Execution35/67

C execution environment with a debugger:

 

[Diagram:Pic/gdbproc-small.png]


Debuggers36/67

A debugger gives control of program execution:

The gdb command is a command-line debugger for C, C++ ...

There are GUI front-ends available (e.g. xxgdb, ddd, ...).


... Debuggers37/67

For gdb, programs must be compiled with the gcc -g flag.

gdb takes two command line arguments:

$ gdb executable core

E.g.

$ gdb  a.out  core
$ gdb  myprog

The core argument is optional.


gdb Sessions38/67

gdb is like a shell to control and monitor an executing C program.

Example session:

$ gcc -g -Wall -Werror -o prog prog.c
$ gdb prog
Copyright (C) 2014 Free Software Foundation, Inc...
(gdb) break 9
Breakpoint 1 at 0x100f03: file prog.c, line 9.
(gdb) run
Starting program: /Users/comp1921/..../prog 

Breakpoint 1, main (argc=1, argv=0x7ffbc8) at prog.c:9
9               for (i = 1; i <= 3; i++) { 
(gdb) next
10              sum += a[i];
(gdb) print sum
$1 = 0
(gdb) print a[i]
$2 = 4
(gdb) print i
$3 = 1
(gdb) print a@1
$4 = {{7, 4, 3}}
(gdb) cont
...


Basic gdb Commands39/67

quit -- quits from gdb

help [CMD] -- on-line help (gives information about CMD command)

run ARGS -- run the program


gdb Status Commands40/67

where -- find which function the program was executing when it crashed.

list [LINE] -- display five lines either side of current statement.

print EXPR -- display expression values


gdb Execution Commands41/67

break [PROC|LINE] -- set break-point

On entry to procedure PROC (or reaching line LINE), stop execution and return control to gdb.

next -- single step (over procedures)

Execute next statement; if statement is a procedure call, execute entire procedure body.

step -- single step (into procedures)

Execute next statement; if statement is a procedure call, go to first statement in procedure body.

For more details see gdb's on-line help.


Using a Debugger42/67

The most common time to invoke a debugger is after a run-time error.

If this produces a core file, start gdb ...

Note, however, that the program may crash well after the bug ...


... Using a Debugger43/67

Once you have an idea where the bug might be:

This will eventually reveal a variable which has an incorrect value.


... Using a Debugger44/67

Once you find that the value of a given variable (e.g. x) is wrong, the next step is to determine why it is wrong.

There are two possibilities:

Example:

if (c > 0) { x = a+b; }

If we know that

Then we need to find out where a, b and c were set.


Laws of Debugging45/67

Courtesy of Zoltan Somogyi, Melbourne University

Before you can fix it, you must be able to break it (consistently).
(non-reproducible bugs ... Heisenbugs ... are extremely difficult to deal with)

If you can't find a bug where you're looking, you're looking in the wrong place.
(taking a break and resuming the debugging task later is generally a good idea)

It takes two people to find a subtle bug, but only one of them needs to know the program.
(the second person simply asks questions to challenge the debugger's assumptions)

(In fact, sometimes the second person doesn't have to do or say anything! The process of explaining the problem is often enough to trigger a Eureka event.)


Possibly Untrue Assumptions46/67

Debugging can be extremely frustrating when you make assumptions about the problem which turn out to be wrong.

Some things to be wary of:


Performance Tuning


Software Development Process48/67

Reminder of how software development runs ...

Typically, iterate over the implementation/testing phases.


Performance49/67

Why do we care about performance?

Good performance => less hardware, happy users.

Bad performance => more hardware, unhappy users.

Generally:   performance = execution time

Other measures: memory/disk space, network traffic, disk i/o.

Execution time can be measured in two ways:


... Performance50/67

In the past, performance was a significant problem.

Unfortunately, there is usually a trade-off between ...


Knuth: "Premature optimization is the root of all evil"


Development Strategy51/67

A pragmatic approach to efficiency:

Points to note:

Pike: "A fast program that gets the wrong answer saves no time."


... Development Strategy52/67

Strategy for developing efficient programs:

  1. Design the program well
  2. Implement the program well
  3. Test the program well
  4. Only after you're sure it's working, measure performance
  5. If (and only if) performance is inadequate, find the "hot spots"
  6. Tune the code to fix these
  7. Repeat measure-analyse-tune cycle until performance ok


Exercise: Prime Number Tester53/67

Consider a function to test numbers for primality.

An integer n is prime if it has no divisors except 1 and n

Straightforward, literal, C implementation:

int is_prime(int n) {
	int i, ndiv = 0;
	for (i = 1; i <= n; i++) {
		if (n % i == 0) {
			ndiv++;
		}
	}
	return (ndiv == 2);
}

Implement, test, and examine performance. Tune, if necessary.
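One possible tuning, as a sketch rather than the intended solution: trial division only needs to go up to the square root of n, and even numbers can be handled up front. The names is_prime_fast and agree_up_to are assumptions. The original is repeated so the two versions can be compared before trusting the faster one.

```c
// Original version: counts all divisors of n in 1..n.
int is_prime(int n) {
    int i, ndiv = 0;
    for (i = 1; i <= n; i++) {
        if (n % i == 0) {
            ndiv++;
        }
    }
    return (ndiv == 2);
}

// Tuned version: same result, but stops at the first divisor found
// and only tries odd candidates up to sqrt(n).
int is_prime_fast(int n) {
    int i;
    if (n < 2) return 0;
    if (n % 2 == 0) return n == 2;
    // i <= n / i is the same test as i*i <= n, without risk of overflow
    for (i = 3; i <= n / i; i += 2) {
        if (n % i == 0) return 0;
    }
    return 1;
}

// Returns 1 if both versions agree on every n in 0..limit, 0 otherwise.
int agree_up_to(int limit) {
    int n;
    for (n = 0; n <= limit; n++) {
        if (is_prime(n) != is_prime_fast(n)) return 0;
    }
    return 1;
}
```

The original does about n loop iterations per call; the tuned version does roughly sqrt(n)/2 in the worst case.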


When to Tune?54/67

We should only consider tuning the performance of a program:

Pike's Guideline:

the time spent making a program faster
should not be more than the time the speedup will save
during the lifetime of the program


Performance Analysis55/67

Before we can tune the performance of a program

Performance analysis can be performed at various levels of detail:

A Unix tool for performance analysis:


Benchmarks56/67

A benchmark is

E.g. sorting benchmark

Data             Random  Sorted  Reverse
small (~10)      ??      ??      ??
medium (~10^3)   ??      ??      ??
large (~10^6)    ??      ??      ??

Could potentially use an extension of the cases developed for testing the program.
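A sketch of a harness that could fill in such a table, assuming the library qsort as the sort under test and clock() as the timer. All names here are hypothetical, and the sizes 10, 1000 and 1000000 are taken from the small/medium/large rows.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

enum order { RANDOM, SORTED, REVERSE };

// Comparison function for qsort: ascending order of ints.
static int cmp_int(const void *p, const void *q) {
    int a = *(const int *)p, b = *(const int *)q;
    return (a > b) - (a < b);
}

// Fills a[0..n-1] in the requested initial order.
static void fill(int a[], int n, enum order kind) {
    int i;
    for (i = 0; i < n; i++) {
        if (kind == SORTED)       a[i] = i;
        else if (kind == REVERSE) a[i] = n - i;
        else                      a[i] = rand();
    }
}

// Returns CPU seconds used by qsort on n items, or -1.0 if malloc fails.
double time_sort(int n, enum order kind) {
    int *a = malloc(n * sizeof *a);
    clock_t start, end;
    if (a == NULL) return -1.0;
    fill(a, n, kind);
    start = clock(); // time only the sort, not the setup
    qsort(a, n, sizeof *a, cmp_int);
    end = clock();
    free(a);
    return (double)(end - start) / CLOCKS_PER_SEC;
}

// Prints one row per data size, one column per initial order.
void print_benchmark(void) {
    int sizes[] = { 10, 1000, 1000000 };
    int i;
    printf("%10s %10s %10s %10s\n", "n", "Random", "Sorted", "Reverse");
    for (i = 0; i < 3; i++) {
        printf("%10d %10.4f %10.4f %10.4f\n", sizes[i],
               time_sort(sizes[i], RANDOM),
               time_sort(sizes[i], SORTED),
               time_sort(sizes[i], REVERSE));
    }
}
```

For the small sizes the measured time is likely to be below the resolution of clock(), so repeating each run many times and dividing would give more usable numbers.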


... Benchmarks57/67

Benchmark caveats:

Benchmarks are not useful as a basis for performance tuning


The time Command58/67

The time command:

What resources it measures:


... The time Command59/67

What other resources it measures:

Note: not all systems measure all resource usages.


... The time Command60/67

Things to note when interpreting time output:


Exercise: Word Frequencies61/67

Consider a program wfreq to process text files (via stdin):

Use the /usr/bin/time command to measure the execution cost of wfreq.

Determine the approximate cost per byte of input.


Program Execution62/67

Observation shows that most programs behave as follows:

This is often quoted as the "90/10 rule"   (or "80/20 rule")

This means that

Concentrate efforts at tuning in the heavily-used code.

(Sometimes this requires us to change the code that invokes the heavily-used code)


Performance Improvement63/67

Once you have identified which region of code is "hot", you can improve the performance of this code:


Efficiency Tricks64/67

Avoid unnecessary repeated evaluation ...

Compilers detect straightforward examples but may not handle some examples obvious to humans.

for (i = 1; i <= N; i++) {
	x += f(y);
}

becomes

res = f(y);
for (i = 1; i <= N; i++) {
	x += res;
}
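The transformation is only safe when f has no side effects and y does not change inside the loop. A runnable sketch under those assumptions, with a hypothetical pure function f:

```c
// Hypothetical pure function: result depends only on y, no side effects.
static int f(int y) {
    return y * y + 1;
}

// Original form: f(y) is evaluated N times.
int sum_naive(int y, int N) {
    int i, x = 0;
    for (i = 1; i <= N; i++) {
        x += f(y);
    }
    return x;
}

// Hoisted form: f(y) is evaluated once, outside the loop.
int sum_hoisted(int y, int N) {
    int i, x = 0;
    int res = f(y);
    for (i = 1; i <= N; i++) {
        x += res;
    }
    return x;
}
```

If f printed something, updated a global, or read input, the two forms would no longer behave the same, which is one reason a compiler may decline to do this itself.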


... Efficiency Tricks65/67

Use local variables instead of global variables ...

May enable compiler to optimize some operations.

i = k + 1;
i = i - j;
x = a[i];

If i is local then compiler may generate:

x = a[k+1-j];
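The same idea applied to a loop, as a sketch with hypothetical names: accumulate into a local and store the global once at the end. Whether this is actually faster depends on the compiler and optimisation level, so measure before relying on it.

```c
int total = 0; // global: may be visible to any other function

// Accumulates directly into the global on every iteration.
void add_all_global(const int a[], int n) {
    int i;
    for (i = 0; i < n; i++) {
        total += a[i];
    }
}

// Accumulates into a local copy and writes the global back once.
void add_all_local(const int a[], int n) {
    int i, t = total;
    for (i = 0; i < n; i++) {
        t += a[i];
    }
    total = t;
}
```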


... Efficiency Tricks66/67

Caching

Buffering

Separate out special cases

Use data instead of code
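A small sketch of "use data instead of code": a lookup table replaces a chain of if or switch cases. The example (days per month in a non-leap year) and its names are assumptions chosen for illustration.

```c
// Days in each month of a non-leap year, indexed by month-1.
static const int days_in[12] = {
    31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31
};

// Returns the number of days in month (1..12) of a non-leap year,
// or -1 if month is out of range.
int days_in_month(int month) {
    if (month < 1 || month > 12) return -1;
    return days_in[month - 1];
}
```

The range check is the only remaining branch; every valid month costs one array access.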


Tips for Quiz 1 (during Week-5 lab) 67/67


Produced: 11 Aug 2016