Chapter 5
Greedy Algorithms
Optimization Problems
Optimization problem: the problem of finding the best solution among all feasible solutions.
Two common techniques:
- Greedy Algorithms (local)
- Dynamic Programming (global)
Greedy Algorithms
Greedy algorithms typically consist of:
- A set of candidate solutions
- A feasibility function that checks whether a set of candidates is feasible
- A selection function indicating, at a given time, which is the most promising candidate not yet used
- An objective function giving the value of a solution; this is the function we are trying to optimize
Step by Step Approach
- Initially, the set of chosen candidates is empty.
- At each step, add the best remaining candidate to this set; the choice is guided by the selection function.
- If the enlarged set is no longer feasible, remove the candidate just added; otherwise it stays.
- Each time the set of chosen candidates grows, check whether it now constitutes a solution to the problem.
- When a greedy algorithm works correctly, the first solution found in this way is always optimal.
Examples of Greedy Algorithms
Graph algorithms: Breadth-First Search (shortest paths in unweighted graphs), Dijkstra's shortest-path algorithm, Minimum Spanning Trees
Data compression: Huffman coding
Scheduling: Activity selection, minimizing time in system, deadline scheduling
Other heuristics: graph coloring, Traveling Salesman, set covering
Elements of Greedy Strategy
Greedy-choice property: a globally optimal solution can be arrived at by making locally optimal (greedy) choices.
Optimal substructure: an optimal solution to the problem contains within it optimal solutions to sub-problems. Be able to demonstrate that if A is an optimal solution containing s1, then the set A' = A - {s1} is an optimal solution to the smaller problem without s1.
Analysis
- The selection function is usually based on the objective function; they may even be identical. But often there are several plausible ones.
- At every step, the procedure chooses the best candidate, without worrying about the future. It never changes its mind: once a candidate is included in the solution, it is there for good; once a candidate is excluded, it is never considered again.
Greedy algorithms do NOT always yield optimal solutions, but for many problems they do.
Huffman Coding
Huffman codes: a very effective technique for compressing data, typically saving 20%-90%.
Coding
Problem: Consider a data file of 100,000 characters. You can safely assume that there are many a, e, i, o, u, blanks, and newlines, and few q, x, z's. We want to store it compactly.
Solutions:
- Fixed-length code, e.g., ASCII: 8 bits per character
- Variable-length code, e.g., Huffman code: can take advantage of the relative frequencies of letters to save space
Example
With 8 characters, a fixed-length code needs 3 bits per character:

Char  Frequency  Code  Total Bits
E     120        000   360
L     42         001   126
D     42         010   126
U     37         011   111
C     32         100   96
M     24         101   72
K     7          110   21
Z     2          111   6
Total: 918 bits
Example (cont.)
[Figure: complete binary tree for the fixed-length code; leaves E:120, L:42, D:42, U:37, C:32, M:24, K:7, Z:2, with left edges labeled 0 and right edges labeled 1]

Char: E   L   D   U   C   M   K   Z
Code: 000 001 010 011 100 101 110 111
Example (cont.)
Variable-length code (can take advantage of the relative frequencies of letters to save space): Huffman codes, constructed below for the characters E, L, D, U, C, M, K, Z.
Huffman Tree Construction (1)
1. Associate each char with a weight (= frequency) to form a subtree of one node (char, weight).
2. Group all subtrees to form a forest.
3. Sort the subtrees by ascending weight of their subroots.
4. Merge the first two subtrees (the ones with lowest weights).
5. Assign the new subroot a weight equal to the sum of the weights of its two children.
6. Repeat steps 3-5 until only one tree remains in the forest.
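The steps above can be sketched in Python. A binary heap stands in for the repeated sorting of step 3 (a standard refinement, not something the slides prescribe), and the frequencies are the ones from the running E…Z example:

```python
import heapq
from itertools import count

# Frequencies from the slides' running example.
freqs = {'E': 120, 'D': 42, 'L': 42, 'U': 37, 'C': 32, 'M': 24, 'K': 7, 'Z': 2}

def huffman_codes(freqs):
    """Build a Huffman code by repeatedly merging the two lightest subtrees."""
    tie = count()  # tiebreaker so heapq never has to compare tree nodes
    # Steps 1-2: a forest of one-node subtrees, stored as (weight, tiebreak, tree).
    heap = [(w, next(tie), ch) for ch, w in freqs.items()]
    heapq.heapify(heap)  # step 3: a heap replaces explicit re-sorting
    while len(heap) > 1:
        # Steps 4-5: merge the two lightest subtrees; new weight = sum of children.
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tie), (t1, t2)))
    _, _, root = heap[0]  # step 6: one tree left
    # Read off codewords: left edge = '0', right edge = '1'.
    codes = {}
    def walk(node, prefix):
        if isinstance(node, str):          # a leaf holds a character
            codes[node] = prefix or '0'    # single-symbol corner case
            return
        walk(node[0], prefix + '0')
        walk(node[1], prefix + '1')
    walk(root, '')
    return codes

codes = huffman_codes(freqs)
total_bits = sum(freqs[ch] * len(code) for ch, code in codes.items())
print(total_bits)  # 785, vs. 918 for the 3-bit fixed-length code
```

The exact codewords depend on how ties are broken, but the total cost (785 bits here) is the same for every Huffman tree on these frequencies.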
Huffman Tree Construction (2)
[Figure: the forest after the first few merges]
Huffman Tree Construction (3)
[Figure: the completed Huffman tree]
Assigning Codes
Char  Freq  Code    Bits
E     120   0       120
D     42    101     126
L     42    110     126
U     37    100     111
C     32    1110    128
M     24    11111   120
K     7     111101  42
Z     2     111100  12
Total: 785 bits. Compare with 918 bits for the fixed-length code: about 15% less.
Huffman Coding Tree
[Figure: the final Huffman coding tree]
Coding and Decoding
With the fixed-length code (E=000, L=001, D=010, U=011, C=100, M=101, K=110, Z=111):
DEED: 010 000 000 010 (12 bits)
MUCK: 101 011 100 110 (12 bits)

With the Huffman code (E=0, L=110, D=101, U=100, C=1110, M=11111, K=111101, Z=111100):
DEED: 101 0 0 101 (8 bits)
MUCK: 11111 100 1110 111101 (18 bits)
Prefix codes
A set of codes is said to meet the prefix property if no code in the set is the prefix of another. Such codes are called prefix codes.
Huffman codes are prefix codes.
Char: E  L    D    U    C     M      K       Z
Code: 0  110  101  100  1110  11111  111101  111100
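Because no codeword is a prefix of another, a bit stream can be decoded greedily left to right with no separators between codewords. A minimal sketch using the variable-length code above:

```python
# The Huffman code from the slides.
code = {'E': '0', 'L': '110', 'D': '101', 'U': '100',
        'C': '1110', 'M': '11111', 'K': '111101', 'Z': '111100'}

def encode(text, code):
    """Concatenate the codewords; no separators are needed."""
    return ''.join(code[ch] for ch in text)

def decode(bits, code):
    """Scan left to right: the prefix property guarantees the first
    codeword matching the front of the stream is the right one."""
    rev = {v: k for k, v in code.items()}
    out, cur = [], ''
    for b in bits:
        cur += b
        if cur in rev:          # unambiguous thanks to the prefix property
            out.append(rev[cur])
            cur = ''
    return ''.join(out)

print(encode('DEED', code))                 # 10100101
print(decode('111111001110111101', code))   # MUCK
```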
Coin Changing
Goal. Given currency denominations 1, 5, 10, 25, 100, devise a method to pay an amount to a customer using the fewest number of coins.
Ex: 34¢.
Cashier's algorithm. At each iteration, add coin of the largest value that does not take us past the amount to be paid.
Ex: $2.89.
Coin-Changing: Greedy Algorithm
Cashier's algorithm. At each iteration, add coin of the largest value that does not take us past the amount to be paid.
Q. Is cashier's algorithm optimal?
Sort the coin denominations by value: c1 < c2 < … < cn.

S ← ∅                       (coins selected)
while (x > 0) {
    let k be the largest integer such that ck ≤ x
    if (k = 0)
        return "no solution found"
    x ← x - ck
    S ← S ∪ {k}
}
return S
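The pseudocode translates directly to Python; scanning the denominations from largest to smallest replaces the repeated "largest ck ≤ x" search:

```python
def cashier(x, denoms):
    """Greedy change-making: repeatedly take the largest coin that fits."""
    coins = []
    for c in sorted(denoms, reverse=True):
        while x >= c:
            x -= c
            coins.append(c)
    if x > 0:
        return None  # no solution (possible if 1 is not a denomination)
    return coins

us = [1, 5, 10, 25, 100]
postal = [1, 10, 21, 34, 37, 44, 70, 100, 350, 1225, 1500]

print(cashier(34, us))       # [25, 5, 1, 1, 1, 1]
print(cashier(140, postal))  # [100, 37, 1, 1, 1] -- 5 coins, but [70, 70] is optimal
```

The second call previews the analysis below: greedy is optimal for U.S. coinage but not for arbitrary denominations.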
Coin-Changing: Analysis of Greedy Algorithm
Observation. Greedy algorithm is sub-optimal for US postal denominations: 1, 10, 21, 34, 37, 44, 70, 100, 350, 1225, 1500.
Counterexample. 140¢. Greedy: 100, 37, 1, 1, 1 (5 coins). Optimal: 70, 70 (2 coins).
Greedy algorithm failed!
Coin-Changing: Analysis of Greedy Algorithm
Theorem. Greedy is optimal for U.S. coinage: 1, 5, 10, 25, 100.
Proof (by induction on x).
- Let ck be the kth smallest coin.
- Consider the optimal way to change x, where ck ≤ x < ck+1: greedy takes coin k.
- We claim that any optimal solution must also take coin k: if not, it needs enough coins of types c1, …, ck-1 to add up to x, and the table below shows no optimal solution can do this.
- The problem then reduces to coin-changing x - ck cents, which, by induction, is optimally solved by the greedy algorithm.

k  ck   All optimal solutions must satisfy  Max value of coins 1, …, k-1 in any OPT
1  1    P ≤ 4                               -
2  5    N ≤ 1                               4
3  10   N + D ≤ 2                           4 + 5 = 9
4  25   Q ≤ 3                               20 + 4 = 24
5  100  no limit                            75 + 24 = 99

(P = pennies, N = nickels, D = dimes, Q = quarters.)
Coin-Changing: Analysis of Greedy Algorithm
Theorem. Greedy is optimal for U.S. coinage: 1, 5, 10, 25, 100. For ck ≤ x < ck+1, greedy takes coin k, and any optimal solution must also take coin k, since the maximum value of coins 1, …, k-1 in any optimal solution is always below ck:

k  ck   All optimal solutions must satisfy  Max value of coins 1, …, k-1 in any OPT
1  1    P ≤ 4                               -
2  5    N ≤ 1                               4
3  10   N + D ≤ 2                           4 + 5 = 9
4  25   Q ≤ 3                               20 + 4 = 24
5  100  no limit                            75 + 24 = 99

The same table for denominations 1, 10, 25, 100 (no nickel):

k  ck   All optimal solutions must satisfy  Max value of coins 1, …, k-1 in any OPT
1  1    P ≤ 9                               -
2  10   P + D ≤ 8                           9
3  25   Q ≤ 3                               40 + 4 = 44
4  100  no limit                            75 + 44 = 119

Here the rightmost column is no longer always smaller than ck (44 ≥ 25, 119 ≥ 100), so the argument breaks down; and indeed greedy fails for these denominations (30¢: greedy gives 25 + 1 + 1 + 1 + 1 + 1, six coins, while 10 + 10 + 10 uses three).
Kevin’s problem
Activity-selection Problem
Input: a set S of n activities a1, a2, …, an; si = start time of activity i, fi = finish time of activity i.
Output: a subset A with the maximum number of mutually compatible activities. Two activities are compatible if their intervals don't overlap.
[Figure: activities a-h shown as intervals on a timeline from 0 to 11]
Interval Scheduling: Greedy Algorithms
Greedy template. Consider jobs in some order; take each job provided it is compatible with the ones already taken.
- Earliest start time: consider jobs in ascending order of start time sj.
- Earliest finish time: consider jobs in ascending order of finish time fj.
- Shortest interval: consider jobs in ascending order of interval length fj - sj.
- Fewest conflicts: for each job, count the number of conflicting jobs cj; schedule in ascending order of cj.
Greedy algorithm. Consider jobs in increasing order of finish time; take each job provided it is compatible with the ones already taken.
Implementation. O(n log n). Remember the job j* that was added last to A; job j is compatible with A iff sj ≥ fj*.
Interval Scheduling: Greedy Algorithm
Sort jobs by finish times so that f1 ≤ f2 ≤ … ≤ fn.

A ← ∅                       (jobs selected)
for j = 1 to n {
    if (job j compatible with A)
        A ← A ∪ {j}
}
return A
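A sketch of the algorithm in Python, with jobs as (start, finish) pairs; the interval values below are made up for illustration, since the slides' figure does not list them:

```python
def select_activities(jobs):
    """Earliest-finish-time greedy; jobs are (start, finish) pairs."""
    selected = []
    last_finish = float('-inf')
    for start, finish in sorted(jobs, key=lambda job: job[1]):  # sort by finish
        if start >= last_finish:       # compatible with all jobs taken so far
            selected.append((start, finish))
            last_finish = finish
    return selected

# Hypothetical intervals.
jobs = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9), (6, 10), (8, 11)]
print(select_activities(jobs))  # [(1, 4), (5, 7), (8, 11)]
```

The sort dominates the running time, giving the O(n log n) bound; the scan itself is linear because only the last finish time is remembered.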
Interval Scheduling: Analysis
Theorem. The greedy algorithm is optimal.
Proof (by contradiction).
- Assume greedy is not optimal, and let's see what happens.
- Let i1, …, ik denote the set of jobs selected by greedy.
- Let j1, …, jm denote the set of jobs in an optimal solution with i1 = j1, i2 = j2, …, ir = jr for the largest possible value of r.
- Greedy chooses the compatible job with the earliest finish time, so job ir+1 finishes no later than job jr+1. Why not replace job jr+1 with job ir+1?
- The resulting solution is still optimal and agrees with greedy for a bigger value than r (ir+1 = jr+1): contradiction! ∎

[Figure: greedy's jobs i1, …, ir+1 aligned above OPT's jobs j1, …, jr+1; job ir+1 finishes before job jr+1]
Weighted Interval Scheduling
Weighted interval scheduling problem. Job j starts at sj, finishes at fj, and has weight (or value) vj. Two jobs are compatible if they don't overlap. Goal: find a maximum-weight subset of mutually compatible jobs.

[Figure: activities a-h shown as intervals on a timeline from 0 to 11]
Greedy algorithm?
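A small sketch of why the answer is "no": applying the earliest-finish-time rule to weighted jobs (the weights below are hypothetical) can miss almost all of the value:

```python
def greedy_by_finish(jobs):
    """Unweighted earliest-finish rule applied to (start, finish, weight) jobs."""
    total, last_finish = 0, float('-inf')
    for start, finish, weight in sorted(jobs, key=lambda j: j[1]):
        if start >= last_finish:
            total += weight
            last_finish = finish
    return total

# One heavy job overlapping two light ones.
jobs = [(0, 3, 1), (2, 6, 100), (4, 7, 1)]
print(greedy_by_finish(jobs))  # 2: greedy takes the two weight-1 jobs, missing weight 100
```

The optimal schedule here takes only the middle job, for weight 100; this is why the weighted problem needs dynamic programming instead.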
Weighted Interval Scheduling
Cost?
Set Covering (one of Karp's 21 NP-complete problems)
Given:
- a set of elements B (the universe)
- a collection S of n sets {Si} whose union equals B
Output: a cover of B, i.e., a subset of S whose union = B.
Cost: the number of sets picked.
Goal: a minimum-cost cover.

Example: how many Walmart centers should Walmart build in Ohio? For each town t, let St = {towns that are within 30 miles of t}; a Walmart center at t will cover all towns in St.
Set Covering - Greedy approach
while (not all elements covered)
    pick the Si with the largest number of uncovered elements

Claim: if the optimal cover uses k sets, greedy uses at most k ln n sets.
Proof: Let nt be the number of elements not covered after t iterations (n0 = n). The remaining nt elements are covered by the k sets of an optimal cover, so some set covers at least nt/k of them:
    nt+1 ≤ nt - nt/k = nt(1 - 1/k) ≤ n0(1 - 1/k)^(t+1) ≤ n0 e^-(t+1)/k
so nt < 1 when t = k ln n.
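The greedy loop in Python; the universe and sets below are hypothetical stand-ins for the towns example:

```python
def greedy_set_cover(universe, sets):
    """Repeatedly pick the set that covers the most still-uncovered elements."""
    uncovered = set(universe)
    cover = []
    while uncovered:
        best = max(sets, key=lambda s: len(s & uncovered))
        if not best & uncovered:
            return None               # the sets cannot cover the universe
        cover.append(best)
        uncovered -= best
    return cover

# Hypothetical towns 1..7; each set plays the role of some St.
B = set(range(1, 8))
S = [{1, 2, 3}, {3, 4, 5}, {5, 6, 7}, {1, 4, 7}, {2, 6}]
print(len(greedy_set_cover(B, S)))  # 3
```

Each iteration covers at least a 1/k fraction of what remains, which is exactly the nt+1 ≤ nt(1 - 1/k) step of the proof.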
Dynamic programming