Week 11
Problem-Solving Strategies | 1/31 |
Five basic problem-solving strategies:
- solution by evolution: adapt a known method
- divide and conquer: solve partial problem, then extend
- generate and test: make possible solutions, test them
- approximation: solve a close, but easier, problem
- simulation: create an executable model
In scenarios where
- it is simple to test whether a given state is a solution
- it is easy to generate new states
(preferably likely solutions)
then a generate and test strategy can be used.
It is necessary that states are generated systematically
- so that we are guaranteed to be approaching a solution
Simply generating random states and testing them ...
- may take a very long time to find a solution
(or may never find one)
... Generate and Test | 4/31 |
Simple example: checking whether an integer n is prime
- generate/test all possible factors of n
- if none of them pass the test, then n is prime
Generation is straightforward:
- produce a sequence of all numbers from 2 to n-1
Testing is also straightforward:
- check whether next number divides n exactly
... Generate and Test | 5/31 |
Function for primality checking:
// check whether n is a prime number (returns 1 if prime, 0 otherwise)
int isPrime(int n) {
    int retval = (n >= 2);  // 0, 1 and negative numbers are not prime
    int i;                  // next number to be tested as possible divisor
    for (i = 2; i < n; i++) {   // generate
        if (n % i == 0) {       // test
            // i is a divisor => n is not prime
            retval = 0;
        }
    }
    return retval;
}
Can be optimised: end the loop as soon as a divisor is found, and change (i < n)
to (i*i <= n), since any composite n has a divisor no larger than its square root
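A minimal sketch of the optimised version, assuming both changes are applied together. The name isPrimeFast is hypothetical, and the loop condition is written as i <= n / i, which is equivalent to i*i <= n for positive values but cannot overflow:

```c
// Sketch of the optimised test; isPrimeFast is a hypothetical name.
int isPrimeFast(int n) {
    if (n < 2)
        return 0;                        // 0, 1 and negatives are not prime
    for (int i = 2; i <= n / i; i++) {   // same as i*i <= n, without overflow
        if (n % i == 0)
            return 0;                    // divisor found: stop immediately
    }
    return 1;                            // no divisor up to sqrt(n)
}
```

For a prime n the loop now runs about sqrt(n) times instead of n-2 times.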
Given a function to test for primality:
int isPrime(int n);
write a function to find the smallest prime larger than a given number:
int nextPrime(int n) { ... }
Then write a program to produce all prime numbers starting from 1.
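One possible shape for nextPrime, offered as a sketch rather than a model answer. It is itself generate and test: generate n+1, n+2, ... and test each with isPrime. A compact isPrime is included only so that the fragment compiles on its own; the guard on INT_MAX is an assumption added here:

```c
#include <assert.h>
#include <limits.h>

// Compact primality test so that this sketch compiles on its own;
// any correct isPrime() with the same prototype would do.
int isPrime(int n) {
    if (n < 2) return 0;
    for (int i = 2; i <= n / i; i++)
        if (n % i == 0) return 0;
    return 1;
}

// Smallest prime strictly larger than n.
// Assumes n < INT_MAX; INT_MAX is taken to be 2147483647 (32-bit int), which
// is itself prime, so the loop stops before the candidate can overflow.
int nextPrime(int n) {
    assert(n < INT_MAX);
    int candidate = (n < 2) ? 2 : n + 1;   // generate
    while (!isPrime(candidate))            // test
        candidate++;
    return candidate;
}
```

Producing all primes then amounts to starting from 1 and calling nextPrime repeatedly on the previous result.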
Problem to solve ...
Is there a subset S of these numbers with sum(S)=1000?
34, 38, 39, 43, 55, 66, 67, 84, 85, 91,
101, 117, 128, 138, 165, 168, 169, 182, 184, 186,
234, 238, 241, 276, 279, 288, 386, 387, 388, 389
General problem:
- given n integers and a target sum k
- is there a subset that adds up to exactly k?
What strategy might we use?
... Example: Subset Sum | 8/31 |
Simple generate and test approach:
A: set of n distinct integers
for each subset S of A {
if (sum(S) == k)
return YES
}
return NO
How many subsets are there of n elements?
How could we generate them?
Exercise: Generate Subsets | 9/31 |
Devise a method to generate subsets
- given: a set of n distinct integers in an array A
- produces: all subsets of these integers
- where each subset is stored in an array of length ≤n
Hints:
- represent sets as n bits (e.g. n=4: 0000, 1010, 1111, etc.)
- bit i represents the i-th input number
- if bit i is set to 1, then A[i] is in the subset
- if bit i is set to 0, then A[i] is not in the subset
- e.g. if A[]=={1,2,3,5} then 1010 represents {2,5}
(bit 0 is the rightmost bit, so here bits 1 and 3 are set)
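The hints above could be turned into code roughly as follows. This is a sketch, not the expected solution; subsetFromMask is a hypothetical name, and the set is assumed to have no more elements than an unsigned int has bits:

```c
// Copy the elements of A selected by mask into out[]; return the subset size.
// Bit i of mask (bit 0 = least significant) says whether A[i] is included.
// Assumes n is at most the number of bits in unsigned int (typically 32).
int subsetFromMask(const int A[], int n, unsigned int mask, int out[]) {
    int size = 0;
    for (int i = 0; i < n; i++) {
        if ((mask >> i) & 1u)       // is bit i set?
            out[size++] = A[i];
    }
    return size;
}
```

Generating all subsets then means calling this for every mask from 0 up to 2^n - 1.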
Sidetrack: Bit Operators | 10/31 |
C can treat basic data types as bit-strings (e.g. unsigned int)
E.g. 'a' == 0x61 == 0110 0001
E.g. 4999 == 0x00001387 == 0001 0011 1000 0111
Operations on bits:
- bitwise AND ...
1&1 == 1, 0&1 == 0, 0&0 == 0
- bitwise OR ...
1|1 == 1, 0|1 == 1, 0|0 == 0
- bitwise XOR ...
1^1 == 0, 0^1 == 1, 0^0 == 0
- bitwise NOT ...
~1 == 0, ~0 == 1
... Sidetrack: Bit Operators | 11/31 |
More bit operations:
- left shift ...
0x01<<2 == 0x04, 0x07<<4 == 0x70
- right shift ...
0x04>>2 == 0x01, 0x56>>4 == 0x05
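Combining shifts with the bitwise operators gives the usual idioms for working on a single bit. The helper names below are hypothetical and chosen only for illustration; i is assumed to be a valid bit position for unsigned int:

```c
// Hypothetical helper names, for illustration only.
// Each assumes 0 <= i < number of bits in unsigned int.
unsigned int setBit(unsigned int bits, int i)   { return bits |  (1u << i); }  // force bit i to 1
unsigned int clearBit(unsigned int bits, int i) { return bits & ~(1u << i); }  // force bit i to 0
unsigned int flipBit(unsigned int bits, int i)  { return bits ^  (1u << i); }  // invert bit i
int          testBit(unsigned int bits, int i)  { return (bits >> i) & 1u; }   // 1 if bit i is set
```

For example, testBit(0xA, 3) is 1 and testBit(0xA, 2) is 0, because 0xA is 1010 in binary.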
Exercise: Find Subsets with Sum k | 12/31 |
Extend the subset generator
- takes the sum value k as command-line argument
- reads set values from standard input
- prints all subsets that sum to k
To solve our original problem:
- print YES as soon as we find a solution
- print NO after all possible solutions considered
Are there any problems with this approach?
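A sketch of the YES/NO version using bit patterns as the generator. The name subsetSumGT and the limit n < 32 are assumptions made here. It also hints at the problem with the approach: the outer loop may run 2^n times, which for the 30 numbers above is 1,073,741,824 candidate subsets:

```c
#include <assert.h>

// Returns 1 if some subset of A[0..n-1] sums to k, 0 otherwise.
// Generate: every bit pattern from 0 to 2^n - 1; test: sum the selected values.
// Sketch only: assumes n < 32 and that the sums fit in a long.
int subsetSumGT(const int A[], int n, long k) {
    assert(n >= 0 && n < 32);
    unsigned long nSubsets = 1ul << n;           // 2^n candidate subsets
    for (unsigned long mask = 0; mask < nSubsets; mask++) {
        long sum = 0;
        for (int i = 0; i < n; i++)
            if ((mask >> i) & 1ul)               // A[i] is in this subset
                sum += A[i];
        if (sum == k)
            return 1;       // YES as soon as a solution is found
    }
    return 0;               // NO after all 2^n subsets were considered
}
```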
... Exercise: Find Subsets with Sum k | 13/31 |
Alternative approach ...
- int subsetsum(int A[], int n, int k)
(returns 1 if any subset of A[0..n-1] sums to k; returns 0 otherwise)
- easy cases: k==0 (solved by {}), n==0 (no elements)
- consider the last value A[n-1], call it m
- assume that there is a solution (i.e. subset S where sum(S)=k)
- if m is part of a solution ...
- then some subset of the first n-1 values must sum to k-m
- if m is not part of a solution ...
- then the first n-1 values must contain a solution
... Exercise: Find Subsets with Sum k | 14/31 |
Leads to the following divide-and-conquer solution:
int subsetsum(int A[], int n, int k) {
    int retval;
    if (k == 0)
        retval = 1;   // empty set solves this
    else if (n == 0)
        retval = 0;   // no elements => no sums
    else {
        // use considerations from previous page
        int m = A[n-1];
        retval = (subsetsum(A, n-1, k-m) || subsetsum(A, n-1, k));
    }
    return retval;
}
- Measure the performance of subsetsum using the UNIX time command
... Exercise: Find Subsets with Sum k | 15/31 |
Subset Sum is typical of a class of exponential algorithms
- these have exponential performance
- execution time is proportional to 2^N, where N is the input size
- N+1, double the time
- N+2, quadruple the time
- N+3, increase the time by factor 8
- increase N by 10, it takes 2^10 = 1024 times as long to execute
- increase N by 100, it takes 2^100 = 1,267,650,600,228,229,401,496,703,205,376 times as long to execute
- etc
This is a lot slower than quadratic algorithms (e.g. insertion sort or quicksort in the worst case)
Problem-Solving Strategies | 17/31 |
Five basic problem-solving strategies:
- solution by evolution (adapt a known method)
- divide and conquer (solve partial problem, then extend)
- generate and test (make possible solutions, test them)
- approximation (solve a close, but easier, problem)
- simulation (create an executable model)
Approximation is often used to solve numerical problems
- length of a curve determined by a function f
- area under a curve for a function f
- roots of a function f
Essence of approximation:
- solve a simpler, but much more easily solved, problem
- where this new problem gives an approximate solution
- and refine the method until it is "accurate enough"
Exercise: Length of a Curve | 19/31 |
Estimate length: approximate curve as sequence of straight lines.
Approximate curve length by sum of line lengths
... Exercise: Length of a Curve | 20/31 |
Trade-offs in this method:
- large step size ...
- fewer steps, less computation (faster), lower accuracy
- small step size ...
- more steps, more computation (slower), higher accuracy
However, too many steps may lead to higher rounding error.
Each f has an optimal step size ...
- but this is difficult to determine in advance
... Exercise: Length of a Curve | 21/31 |
length = curveLength(0, M_PI, sin);
Convergence when using more and more steps
steps = 0, length = 0.000000
steps = 10, length = 3.815283
steps = 100, length = 3.820149
steps = 1000, length = 3.820197
steps = 10000, length = 3.819753
steps = 100000, length = 3.820198
steps = 1000000, length = 3.820198
Actual answer is 3.820197789...
Example: Finding Roots | 22/31 |
Find where a function crosses the x-axis:
Generate and test: move x1 and x2 together until "close enough"
Problem-Solving Strategies | 24/31 |
In some problem scenarios
- it is difficult to devise an analytical solution
- so build a software model and run experiments
Examples: weather forecasting, traffic flow, queueing, games
Such systems typically require random number generation
- distributions: uniform, numerical, normal, exponential
Accuracy of results depends on accuracy of model.
Example: Gambling Game | 26/31 |
Consider the following game:
- you bet $1 and roll two dice (6-sided)
- if total is between 8 and 11 (inclusive), you get $2 back
- if total is 12, you get $6 back
- otherwise, you lose your money
Is this game worth playing?
Test: start with $5 and play until you have $0 or $20.
In fact, this example is reasonably easy to solve analytically.
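The analytical answer can be checked by enumerating all 36 equally likely rolls. The sketch below assumes that "get $X back" means the total amount returned, stake included; expectedReturn is a hypothetical name:

```c
// Average amount returned per $1 bet, found by enumerating all 36 equally
// likely rolls. Assumes "get $X back" is the total returned, stake included.
double expectedReturn(void) {
    double total = 0.0;
    for (int d1 = 1; d1 <= 6; d1++) {
        for (int d2 = 1; d2 <= 6; d2++) {
            int sum = d1 + d2;
            if (sum == 12)
                total += 6.0;       // 1 of the 36 rolls
            else if (sum >= 8)
                total += 2.0;       // 14 of the 36 rolls (totals 8..11)
            // totals 2..7 return nothing
        }
    }
    return total / 36.0;
}
```

Under that assumption the result is 34/36, roughly 0.944, so each $1 bet loses about 5.6 cents on average.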
... Example: Gambling Game | 27/31 |
We can get a reasonable approximation by simulation
- set our initial balance to 5.00
- generate two random numbers in range 1..6 (dice)
- adjust balance by payout or loss
- repeat above until balance ≤ 0 or balance ≥ 20
- run a very large number of trials like the above
- collect statistics on the outcome
Exercise: Testing the Gambling Game | 28/31 |
Implement a program to test the Gambling Game:
- runs 10000 trials of $5 start until bust or $20
- count the length of each game, then average
- count the number of wins and losses
Reminder of the rules:
- pay $1, roll two 6-sided dice and add up values
- if 2 ≤ total ≤ 7, lose money
- if 8 ≤ total ≤ 11, win $2
- if total = 12, win $6
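A possible skeleton for the simulation, not a model answer. It assumes the payouts include the $1 stake (net change -1, +1 or +5 per round); if "win $2" is meant as a net gain, the two additions need adjusting. All function names are hypothetical, and the caller is expected to seed the generator with srand():

```c
#include <stdlib.h>

// Roll one six-sided die. rand() % 6 has a slight bias, acceptable for a sketch.
static int rollDie(void) {
    return rand() % 6 + 1;
}

// Play one game from `start` dollars until bust (<= 0) or `target` is reached.
// Returns 1 for a win, 0 for a loss; *rounds receives the number of bets made.
// Assumption: the payouts include the $1 stake, so net changes are -1, +1, +5.
int playGame(int start, int target, int *rounds) {
    int balance = start;
    *rounds = 0;
    while (balance > 0 && balance < target) {
        int total = rollDie() + rollDie();
        balance -= 1;                       // pay $1 to play
        if (total == 12)
            balance += 6;
        else if (total >= 8)
            balance += 2;
        (*rounds)++;
    }
    return balance >= target;
}

// Fraction of `trials` games ($5 start, $20 target) that reach the target;
// the average game length is stored in *avgRounds.
double winFraction(int trials, double *avgRounds) {
    int wins = 0;
    long totalRounds = 0;
    for (int t = 0; t < trials; t++) {
        int rounds;
        wins += playGame(5, 20, &rounds);
        totalRounds += rounds;
    }
    *avgRounds = (double)totalRounds / trials;
    return (double)wins / trials;
}
```

The number of losses is simply trials minus the number of wins, so a single counter is enough.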
Example: Area inside a Curve | 29/31 |
Scenario:
- have a closed curve defined by a complex function
- have a function to compute "X is inside/outside curve?"
... Example: Area inside a Curve | 30/31 |
Simulation approach to determining the area:
- determine a region completely enclosing curve
- generate very many random points in this region
- for each point x, compute inside(x)
- count number of insides and outsides
- areaWithinCurve = totalArea * insides/(insides+outsides)
I.e. we approximate the area within the curve by using
the fraction of random points that fall inside the curve
Also known as Monte Carlo estimation
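The steps above can be sketched as follows, using a unit circle as a stand-in for the "complex" curve because its true area (pi) is known. The names insideUnitCircle, randomIn and estimateArea are hypothetical, and rand() is used only for simplicity:

```c
#include <stdlib.h>

// Example inside() test: unit circle centred on the origin (true area is pi).
int insideUnitCircle(double x, double y) {
    return x * x + y * y <= 1.0;
}

// Uniform random double in [lo, hi], built on rand() for simplicity.
static double randomIn(double lo, double hi) {
    return lo + (hi - lo) * ((double)rand() / RAND_MAX);
}

// Monte Carlo estimate of the area of the shape described by inside(),
// which must lie completely within the box [xmin,xmax] x [ymin,ymax].
double estimateArea(int (*inside)(double, double),
                    double xmin, double xmax, double ymin, double ymax,
                    long npoints) {
    long insides = 0;
    for (long i = 0; i < npoints; i++) {
        double x = randomIn(xmin, xmax);
        double y = randomIn(ymin, ymax);
        if (inside(x, y))
            insides++;
    }
    double totalArea = (xmax - xmin) * (ymax - ymin);
    return totalArea * (double)insides / (double)npoints;
}
```

With the enclosing square [-1,1] x [-1,1] and a million points, the estimate should land close to pi; the error shrinks only with the square root of the number of points.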
Course Revision
Three challenging exercises on the focal topic of this course: advanced data structures
- At the end of this course, you have become competent programmers and don't require tips anymore.
Last tutorial is practice for Theory Part of exam
Produced: 12 Oct 2016