Week 11

 Problem-Solving Strategies 1/31

Five basic problem-solving strategies:

1. solution by evolution: adapt a known method
2. divide and conquer: solve partial problem, then extend
3. generate and test: make possible solutions, test them
4. approximation: solve a close, but easier, problem
5. simulation: create an executable model

 Generate and Test

 Generate and Test 3/31

In scenarios where

• it is simple to test whether a given state is a solution
• it is easy to generate new states   (preferably likely solutions)
then a generate and test strategy can be used.

It is necessary that states are generated systematically

• so that we are guaranteed to be approaching a solution
Simply generating random states and testing them ...
• may take a very long time to find a solution   (or may never find one)

 ... Generate and Test 4/31

Simple example: checking whether an integer n is prime

• generate/test all possible factors of n
• if none of them pass the test, then n is prime
Generation is straightforward:
• produce a sequence of all numbers from 2 to n-1
Testing is also straightforward:
• check whether next number divides n exactly

 ... Generate and Test 5/31

Function for primality checking:

```
// check whether n is a prime number
int isPrime(int n) {
    int retval = 1;
    int i; // next number to be tested as possible divisor

    if (n < 2) {
        retval = 0;  // 0, 1 and negative numbers are not prime
    }
    for (i = 2; i < n; i++) {  // generate
        if (n % i == 0) {  // test
            // i is a divisor => n is not prime
            retval = 0;
        }
    }
    return retval;
}
```

Can be optimised: end loop after divisor found, change  `(i < n)`  to  `(i*i <= n)`
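
The two optimisations mentioned above could be sketched as follows. The name `isPrimeFast` is hypothetical, and the loop condition is written as `i <= n / i`, which is the same test as `i*i <= n` but cannot overflow for large `n`.

```c
// Optimised primality check: a sketch, not the only way to do it.
// isPrimeFast is a hypothetical name, not part of the original code.
int isPrimeFast(int n) {
    if (n < 2) return 0;             // 0, 1 and negatives are not prime
    // i <= n / i is the same test as i*i <= n, but cannot overflow
    for (int i = 2; i <= n / i; i++) {
        if (n % i == 0) return 0;    // divisor found: stop immediately
    }
    return 1;                        // no divisor up to sqrt(n)
}
```

Stopping at the square root is enough because any divisor larger than sqrt(n) is paired with one smaller than sqrt(n).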

 Exercise: Primes 6/31

Given a function to test for primality:

```
int isPrime(int n);
```

write a function to find the smallest prime larger than a given number:

```
int nextPrime(int n) { ... }
```

Then write a program to produce all prime numbers starting from 1.
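
One possible shape for `nextPrime`, again as generate and test, is sketched below. The body of `isPrime` here is an assumed implementation, included only so that the sketch compiles on its own.

```c
// Assumed implementation of the given interface, included only so that
// this sketch compiles on its own.
int isPrime(int n) {
    if (n < 2) return 0;
    for (int i = 2; i <= n / i; i++) {
        if (n % i == 0) return 0;
    }
    return 1;
}

// Smallest prime strictly larger than n.
// Generate: candidates n+1, n+2, ...   Test: isPrime(candidate).
// No guard against overflow when n is close to INT_MAX.
int nextPrime(int n) {
    int candidate = (n < 1) ? 2 : n + 1;
    while (!isPrime(candidate)) {
        candidate++;
    }
    return candidate;
}
```

A program that lists primes could then repeatedly call `p = nextPrime(p)`, starting from `p = 1` and printing each result.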

 Example: Subset Sum 7/31

Problem to solve ...

Is there a subset S of these numbers with sum(S)=1000?

```
 34,  38,  39,  43,  55,  66,  67,  84,  85,  91,
101, 117, 128, 138, 165, 168, 169, 182, 184, 186,
234, 238, 241, 276, 279, 288, 386, 387, 388, 389
```

General problem:

• given n integers and a target sum k
• is there a subset that adds up to exactly k?
What strategy might we use?

 ... Example: Subset Sum 8/31

Simple generate and test approach:

```
A: set of n distinct integers

for each subset S of A {
    if (sum(S) == k)
        return YES
}
return NO
```

How many subsets are there of n elements?

How could we generate them?

 Exercise: Generate Subsets 9/31

Devise a method to generate subsets

• given: a set of `n` distinct integers in an array `A`
• produces: all subsets of these integers
• where each subset is stored in an array of length ≤n
Hints:
• represent sets as n bits   (e.g. n=4, `0000`, `1010`, `1111` etc.)
• bit i represents the i-th input number   (counting bits from 0 at the right)
• if bit i is set to 1, then `A[i]` is in the subset
• if bit i is set to 0, then `A[i]` is not in the subset
• e.g. if `A[]=={1,2,3,5}` then `1010` represents `{2,5}`

 Sidetrack: Bit Operators 10/31

C can treat basic data types as bit-strings   (`unsigned int`)

E.g. `'a' == 0x61 == 0110 0001`
E.g. `4999 == 0x00001387 == 0001 0011 1000 0111`

Operations on bits (shown for single bits; C applies them to every bit of the operands):

• bitwise AND ... `1&1 == 1, 0&1 == 0, 0&0 == 0`
• bitwise OR ... `1|1 == 1, 0|1 == 1, 0|0 == 0`
• bitwise XOR ... `1^1 == 0, 0^1 == 1, 0^0 == 0`
• bitwise NOT ... `~1 == 0, ~0 == 1`

 ... Sidetrack: Bit Operators 11/31

More bit operations:

• left shift ... `0x01<<2 == 0x04, 0x07<<4 == 0x70`
• right shift ... `0x04>>2 == 0x01, 0x56>>4 == 0x05`
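
Combining these operators with the bit-string representation of sets from the Generate Subsets exercise, one subset can be extracted per bit pattern. This is a minimal sketch: the names `subsetFromMask` and `numSubsets` are hypothetical, and it assumes `n` is smaller than the number of bits in an `unsigned int`.

```c
// Copy into out[] the elements of A[0..n-1] selected by mask:
// bit i of mask (bit 0 is the rightmost) set  =>  A[i] is in the subset.
// Returns the subset size. out[] must have room for n values.
int subsetFromMask(const int A[], int n, unsigned int mask, int out[]) {
    int size = 0;
    for (int i = 0; i < n; i++) {
        if (mask & (1u << i)) {      // test bit i
            out[size++] = A[i];
        }
    }
    return size;
}

// Number of subsets of n elements: 2^n.
// Assumes n is smaller than the number of bits in an unsigned int.
unsigned int numSubsets(int n) {
    return 1u << n;
}
```

Calling `subsetFromMask` for every `mask` from `0` to `numSubsets(n) - 1` visits each subset exactly once.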

 Exercise: Find Subsets with Sum k 12/31

Extend the subset generator

• takes the sum value k as command-line argument
• reads set values from standard input
• prints all subsets that sum to k
To solve our original problem:
• print YES as soon as we find a solution
• print NO after all possible solutions considered
Are there any problems with this approach?
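
The YES/NO version of this generate-and-test approach might look like the sketch below. The name `subsetSumBits` is hypothetical, and the code assumes `n` is smaller than the number of bits in an `unsigned int` and that all sums fit in an `int`.

```c
// Generate and test: try every bit pattern 0 .. 2^n - 1 as a subset.
// Returns 1 as soon as a subset summing to k is found (YES),
// and 0 after all subsets have been considered (NO).
// Assumes n is smaller than the number of bits in an unsigned int,
// and that the sums fit in an int.
int subsetSumBits(const int A[], int n, int k) {
    unsigned int nSubsets = 1u << n;
    for (unsigned int mask = 0; mask < nSubsets; mask++) {   // generate
        int sum = 0;
        for (int i = 0; i < n; i++) {
            if (mask & (1u << i)) sum += A[i];
        }
        if (sum == k) return 1;                              // test
    }
    return 0;
}
```

The outer loop runs up to 2^n times, so each extra input value doubles the worst-case running time.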

 ... Exercise: Find Subsets with Sum k 13/31

Alternative approach ...

• `int subsetsum(int A[], int n, int k)`
  (returns 1 if any subset of `A[0..n-1]` sums to `k`; returns 0 otherwise)
• easy cases: `k==0` (solved by {}), `n==0` (no elements)
• consider the last value `A[n-1]`, call it `m`
• assume that there is a solution   (i.e. subset S where sum(S)=k)
• if `m` is part of a solution ...
  • then the first n-1 values must sum to `k-m`
• if `m` is not part of a solution ...
  • then the first n-1 values must contain a solution

 ... Exercise: Find Subsets with Sum k 14/31

Leads to the following divide-and-conquer solution:

```
int subsetsum(int A[], int n, int k) {
    int retval;
    if (k == 0)
        retval = 1;   // empty set solves this
    else if (n == 0)
        retval = 0;   // no elements => no sums
    else {
        // use considerations from previous page
        int m = A[n-1];
        retval = (subsetsum(A, n-1, k-m) || subsetsum(A, n-1, k));
    }
    return retval;
}
```

• Measure the performance of `subsetsum` using the UNIX `time` command

 ... Exercise: Find Subsets with Sum k 15/31

Subset Sum is typical of a class of exponential algorithms

• these have exponential performance
• execution time is proportional to 2^N, where N is the size of the input
  • increase N by 1, double the time
  • increase N by 3, increase the time by a factor of 8
  • increase N by 10, it takes 2^10 = 1024 times as long to execute
  • increase N by 100, it takes 2^100 = 1,267,650,600,228,229,401,496,703,205,376 times as long to execute
  • etc
This is a lot slower than quadratic algorithms (e.g. sorting in the worst case)
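
The doubling can be observed directly by counting calls instead of timing them. The sketch below adds a counter to the same recursion as `subsetsum`. The names `subsetsumCounted` and `worstCaseCalls` are hypothetical, and the input (n ones with an unreachable target of n+1) is an assumed worst case chosen so that no branch ever succeeds.

```c
static long calls = 0;   // number of calls made so far (for measurement only)

// Same recursion as subsetsum, with a call counter added.
int subsetsumCounted(const int A[], int n, int k) {
    calls++;
    if (k == 0) return 1;
    if (n == 0) return 0;
    int m = A[n-1];
    return subsetsumCounted(A, n-1, k-m) || subsetsumCounted(A, n-1, k);
}

// Worst case: n ones and a target of n+1, which no subset can reach.
// Returns the number of calls needed to answer NO. Assumes 0 <= n <= 32.
long worstCaseCalls(int n) {
    int A[32];
    for (int i = 0; i < n; i++) A[i] = 1;
    calls = 0;
    subsetsumCounted(A, n, n + 1);
    return calls;
}
```

On this input the call count is 2^(n+1) - 1, so adding one element roughly doubles the work.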

 Approximation

 Problem-Solving Strategies 17/31

Five basic problem-solving strategies:

1. solution by evolution   (adapt a known method)
2. divide and conquer   (solve partial problem, then extend)
3. generate and test   (make possible solutions, test them)
4. approximation   (solve a close, but easier, problem)
5. simulation   (create an executable model)

 Approximation 18/31

Approximation is often used to solve numerical problems

• length of a curve determined by a function f
• area under a curve for a function f
• roots of a function f
Essence of approximation:
• solve a closely related, but much more easily solved, problem
• where this new problem gives an approximate solution
• and refine the method until it is "accurate enough"

 Exercise: Length of a Curve 19/31

Estimate length: approximate the curve as a sequence of straight lines, then approximate the curve length by the sum of the line lengths

• using a function interface:

```
double curveLength(double start, double end, double (*f)(double))
```

 ... Exercise: Length of a Curve 20/31

• large step size ...
  • fewer steps, less computation (faster), lower accuracy
• small step size ...
  • more steps, more computation (slower), higher accuracy
However, too many steps may lead to higher rounding error.

Each f has an optimal step size ...

• but this is difficult to determine in advance

 ... Exercise: Length of a Curve 21/31

```
length = curveLength(0, M_PI, sin);
```

Convergence when using more and more steps

```
steps =       0, length = 0.000000
steps =      10, length = 3.815283
steps =     100, length = 3.820149
steps =    1000, length = 3.820197
steps =   10000, length = 3.819753
steps =  100000, length = 3.820198
steps = 1000000, length = 3.820198
```

 Example: Finding Roots 22/31

Find where a function crosses the x-axis

Generate and test: move x1 and x2 together until "close enough"
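
One common way to move x1 and x2 together is bisection, sketched below. The slide does not name a method, so bisection is an assumption here, as are the names `findRoot` and `xSquaredMinus2`. The sketch also assumes that x1 < x2, that f is continuous, and that f(x1) and f(x2) have opposite signs.

```c
// Example function with a root at sqrt(2); hypothetical, for illustration.
double xSquaredMinus2(double x) {
    return x * x - 2.0;
}

// Bisection sketch. Assumes x1 < x2, f is continuous on [x1,x2], and
// f(x1) and f(x2) have opposite signs, so a root lies between them.
double findRoot(double x1, double x2, double (*f)(double), double eps) {
    while (x2 - x1 > eps) {               // not yet "close enough"
        double mid = (x1 + x2) / 2.0;     // generate a candidate
        if ((f(x1) < 0) == (f(mid) < 0)) {
            x1 = mid;                     // same sign at x1 and mid: root is right of mid
        } else {
            x2 = mid;                     // sign change between x1 and mid
        }
    }
    return (x1 + x2) / 2.0;
}
```

Each pass halves the interval, so the number of passes grows only with the logarithm of the required accuracy.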

 Simulation

 Simulation 25/31

In some problem scenarios

• it is difficult to devise an analytical solution
• so build a software model and run experiments
Examples: weather forecasting, traffic flow, queueing, games

Such systems typically require random number generation

• distributions: uniform, numerical, normal, exponential
Accuracy of results depends on accuracy of model.

 Example: Gambling Game 26/31

Consider the following game:

• you bet \$1 and roll two dice (6-sided)
• if total is between 8 and 11, you get \$2 back
• if total is 12, you get \$6 back
• otherwise, you lose your money
Is this game worth playing?

Test: start with \$5 and play until you have \$0 or \$20.

In fact, this example is reasonably easy to solve analytically.
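
As a sketch of that analytical check, the average return can be computed by enumerating all 36 equally likely rolls. The names `payout` and `expectedPayout` are hypothetical, and the code assumes that "get \$X back" means X is the total amount returned for the \$1 bet.

```c
// Amount returned for a $1 bet, given the dice total.
// Assumes "get $X back" means X is the total amount returned.
int payout(int total) {
    if (total == 12) return 6;
    if (total >= 8 && total <= 11) return 2;
    return 0;
}

// Average amount returned per $1 bet over all 36 equally likely rolls.
double expectedPayout(void) {
    int sum = 0;
    for (int d1 = 1; d1 <= 6; d1++) {
        for (int d2 = 1; d2 <= 6; d2++) {
            sum += payout(d1 + d2);
        }
    }
    return sum / 36.0;
}
```

Under that assumption, 14 of the 36 rolls return \$2 and one returns \$6, so the average return is 34/36, about \$0.94 per \$1 bet, slightly less than the stake.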

 ... Example: Gambling Game 27/31

We can get a reasonable approximation by simulation

• set our initial balance to 5.00
• generate two random numbers in range 1..6 (dice)
• adjust balance by payout or loss
• repeat above until balance ≤ 0 or balance ≥ 20
• run a very large number of trials like the above
• collect statistics on the outcome
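
The steps above could be sketched as follows. The function names are hypothetical, and the payouts assume that "get \$X back" means X is the total returned for the \$1 bet (a net gain of \$1 or \$5).

```c
#include <stdlib.h>

// Roll one 6-sided die. rand() % 6 has a very slight bias,
// which is ignored in this sketch.
int rollDie(void) {
    return rand() % 6 + 1;
}

// Play one game from `start` dollars until bust or `target` is reached.
// Assumes "get $X back" means X is the total returned for the $1 bet.
// Stores the number of bets made in *rounds and returns the final balance.
int playGame(int start, int target, int *rounds) {
    int balance = start;
    *rounds = 0;
    while (balance > 0 && balance < target) {
        int total = rollDie() + rollDie();
        balance -= 1;                                   // place the bet
        if (total == 12) balance += 6;
        else if (total >= 8 && total <= 11) balance += 2;
        (*rounds)++;
    }
    return balance;
}

// Fraction of `trials` games that reach the target. Call srand() first
// to choose the random sequence.
double winRate(int start, int target, int trials) {
    int wins = 0, rounds;
    for (int t = 0; t < trials; t++) {
        if (playGame(start, target, &rounds) >= target) wins++;
    }
    return (double)wins / trials;
}
```

For example, `winRate(5, 20, 10000)` estimates the chance of turning \$5 into \$20. The exact figure varies with the seed passed to `srand`.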

 Exercise: Testing the Gambling Game 28/31

Implement a program to test the Gambling Game:

• runs 10000 trials of \$5 start until bust or \$20
• count the length of each game, then average
• count the number of wins and losses
Reminder of the rules:
• pay \$1, roll two 6-sided dice and add up values
• if 2 ≤ total ≤ 7, lose money
• if 8 ≤ total ≤ 11, win \$2
• if total = 12, win \$6

 Example: Area inside a Curve 29/31

Scenario:

• have a closed curve defined by a complex function
• have a function to compute "X is inside/outside curve?"

 ... Example: Area inside a Curve 30/31

Simulation approach to determining the area:

• determine a region completely enclosing curve
• generate very many random points in this region
• for each point x, compute inside(x)
• count number of insides and outsides
• areaWithinCurve = totalArea * insides/(insides+outsides)
I.e. we approximate the area within the curve by using the fraction of all random points that fall inside the curve

Also known as Monte Carlo estimation
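
A minimal sketch of this estimate is given below. The unit circle stands in for the "complex function", the enclosing region is assumed to be a rectangle, and all names, including the two-argument form of the inside test, are hypothetical.

```c
#include <stdlib.h>

// Example inside/outside test: the unit circle centred on the origin.
// Hypothetical stand-in for the "complex function" in the scenario.
int insideUnitCircle(double x, double y) {
    return x * x + y * y <= 1.0;
}

// Uniform random double in [lo, hi], built from rand().
double randomIn(double lo, double hi) {
    return lo + (hi - lo) * ((double)rand() / RAND_MAX);
}

// Monte Carlo estimate of the area inside a curve that is completely
// enclosed by the rectangle [xmin,xmax] x [ymin,ymax].
double areaWithinCurve(int (*inside)(double, double),
                       double xmin, double xmax,
                       double ymin, double ymax, int samples) {
    int insides = 0;
    for (int i = 0; i < samples; i++) {
        if (inside(randomIn(xmin, xmax), randomIn(ymin, ymax))) insides++;
    }
    double totalArea = (xmax - xmin) * (ymax - ymin);
    return totalArea * insides / samples;   // samples = insides + outsides
}
```

For the unit circle enclosed in the square from -1 to 1 on both axes, the estimate should approach pi as the number of samples grows.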

 Tips for Last Lab 31/31

Course Revision

Three challenging exercises on the focal topic of this course: advanced data structures

• At the end of this course, you have become competent programmers and don't require tips anymore.

Last tutorial is practice for Theory Part of exam

Produced: 12 Oct 2016