Greedy algorithms
David Kauchak
cs161
Summer 2009
Administrative
Thank your TAs
Kruskal’s and Prim’s
Make greedy decision about the next edge to add
Greedy algorithm structure
A locally optimal decision can be made which results in a subproblem that does not rely on the local decision
Interval scheduling
Given n activities A = [a1, a2, …, an], where each activity ai has a start time si and a finish time fi, schedule as many of these activities as possible such that no two of them conflict.
Which activities conflict?
Simple recursive solution
Enumerate all possible solutions and find which schedules the most activities
Is it correct? max{all possible solutions}
Running time? O(n!)
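The slide's recursion is not reproduced in the transcript, so the following is only a minimal sketch of one way to enumerate every schedule. It assumes each activity is a `(start, finish)` tuple with start < finish and treats intervals as half-open, so an activity may begin exactly when another ends. The names `conflicts` and `best_schedule` are illustrative, not from the lecture.

```python
def conflicts(a, b):
    """Two half-open intervals [start, finish) conflict if they overlap."""
    return a[0] < b[1] and b[0] < a[1]


def best_schedule(activities):
    """Try every activity as the next choice and recurse on what still fits."""
    best = []
    for a in activities:
        # Keep only activities compatible with a (this also drops a itself,
        # because an activity with start < finish overlaps itself).
        rest = [b for b in activities if not conflicts(a, b)]
        candidate = [a] + best_schedule(rest)
        if len(candidate) > len(best):
            best = candidate
    return best
```

When no activities conflict at all, the loop branches n ways, then n-1 ways, and so on, which is one way to arrive at a factorial bound.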
Can we do better?
Dynamic programming: O(n^2)
Greedy solution – Is there a way to make a local decision?
Overview of a greedy approach
Greedily pick an activity to schedule
Add that activity to the answer
Remove that activity and all conflicting activities. Call this A’.
Repeat on A’ until A’ is empty
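The loop above can be sketched directly. This is an illustrative version, not code from the lecture: activities are assumed to be `(start, finish)` tuples with start < finish, and the greedy rule is passed in as a hypothetical `key` function because the outline leaves the choice of rule open.

```python
def conflicts(a, b):
    """Half-open intervals [start, finish) conflict if they overlap."""
    return a[0] < b[1] and b[0] < a[1]


def greedy_schedule(activities, key):
    """Repeatedly pick the activity minimizing `key`, then discard conflicts."""
    remaining = list(activities)
    answer = []
    while remaining:
        pick = min(remaining, key=key)   # greedy choice
        answer.append(pick)              # add it to the answer
        # A' = the remaining activities that do not conflict with the pick
        remaining = [b for b in remaining if not conflicts(pick, b)]
    return answer


acts = [(0, 10), (1, 2), (3, 4)]
by_start = greedy_schedule(acts, key=lambda a: a[0])    # earliest start first
by_finish = greedy_schedule(acts, key=lambda a: a[1])   # earliest finish first
```

On this small input the earliest-start rule schedules a single activity while the earliest-finish rule schedules two, so the choice of rule matters.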
Greedy options
Select the activity that starts the earliest, i.e. argmin{s1, s2, s3, …, sn}?
non-optimal
Select the shortest activity, i.e. argmin{f1-s1, f2-s2, f3-s3, …, fn-sn}?
non-optimal
Select the activity with the smallest number of conflicts?
non-optimal
Select the activity that ends the earliest, i.e. argmin{f1, f2, f3, …, fn}?
Multiple optimal solutions
Efficient greedy algorithm
Once you’ve identified a reasonable greedy heuristic:
Prove that it always gives the correct answer
Develop an efficient solution
Is our greedy approach correct?
“Stays ahead” argument: show that no matter what other solution someone provides you, the solution provided by your algorithm always “stays ahead”, in that no other choice could do better
Is our greedy approach correct?
“Stays ahead” argument
Let r1, r2, r3, …, rk be the solution found by our approach, ordered by finish time
Let o1, o2, o3, …, ok be another optimal solution, ordered the same way
Show our approach “stays ahead” of any other solution
r1 r2 r3 … rk
o1 o2 o3 … ok
Stays ahead
Compare the first activities of each solution
finish(r1) ≤ finish(o1), since our approach picks the activity that finishes earliest
r2 r3 … rk
o2 o3 … ok
We have at least as much time as any other solution to schedule the remaining 2…k tasks
An efficient solution
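The pseudocode on this slide did not survive extraction. A minimal sketch of the earliest-finish-time rule, assuming activities are `(start, finish)` tuples and that an activity may start exactly when the previous one finishes (the function name is illustrative):

```python
def schedule_by_earliest_finish(activities):
    """Sort by finish time, then keep each activity that starts after the last pick ends."""
    chosen = []
    last_finish = float("-inf")
    for start, finish in sorted(activities, key=lambda a: a[1]):  # sort: n log n
        if start >= last_finish:      # compatible with everything chosen so far
            chosen.append((start, finish))
            last_finish = finish
    return chosen
```

Because the activities are visited in order of finish time, comparing against the finish time of the most recent pick is enough; there is no need to remove conflicting activities explicitly.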
Running time?
Sorting the activities by finish time: Θ(n log n)
One pass over the sorted activities: Θ(n)
Overall: Θ(n log n)
Better than: O(n!) and O(n^2)
Scheduling all intervals
Given n activities, we need to schedule all activities. Minimize the number of resources required.
Greedy approach?
The best we could ever do is the maximum number of conflicts for any time period
Calculating max conflicts efficiently
[Figure: example timeline annotated with the number of conflicting activities at successive points: 3, 1, 3, 1, …]
Calculating max conflicts
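The algorithm shown on this slide is missing from the transcript. One common way to count the maximum number of simultaneous activities is to sort all 2n endpoints and sweep across them. The sketch below assumes `(start, finish)` tuples and half-open intervals, so an activity ending at time t does not conflict with one starting at t. The name `max_conflicts` is illustrative.

```python
def max_conflicts(activities):
    """Sweep over sorted endpoints, tracking how many activities are active."""
    events = []
    for start, finish in activities:
        events.append((start, 1))     # an activity begins
        events.append((finish, -1))   # an activity ends
    # Tuples sort by time first; at equal times -1 sorts before +1,
    # so an activity ending at t frees its resource for one starting at t.
    events.sort()
    active = best = 0
    for _, delta in events:
        active += delta
        best = max(best, active)
    return best
```

Sorting 2n endpoints and scanning them once is consistent with an O(n log n) bound.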
Correctness?
We can do no better than the max number of conflicts. This exactly counts the max number of conflicts.
Runtime?
O(2n log 2n + n) = O(n log n)
Horn formulas
Horn formulas are a particular form of boolean logic formulas
They are one approach to allow a program to do logical reasoning
Boolean variables: represent some event
x = the murder took place in the kitchen
y = the butler is innocent
z = the colonel was asleep at 8 pm
Implications
Left-hand side is an AND of any number of positive literals
Right-hand side is a single literal
If the colonel was asleep at 8 pm and the butler is innocent then the murder took place in the kitchen
(z ∧ y) ⇒ x
the murder took place in the kitchen
⇒ x
Negative clauses
An OR of any number of negative literals
u = the constable is innocent
t = the colonel is innocent
y = the butler is innocent
(¬u ∨ ¬t ∨ ¬y)
not everyone is innocent
Goal
Given a Horn formula (i.e. a set of implications and negative clauses), determine if the formula is satisfiable (i.e. whether there is an assignment of true/false to the variables that is consistent with every clause of the formula)
Example:
⇒ x
⇒ y
(x ∧ u) ⇒ z
(¬x ∨ ¬y ∨ ¬z)
satisfiable with u = 0, x = 1, y = 1, z = 0
Example:
⇒ x
⇒ y
(x ∧ y) ⇒ z
(¬x ∨ ¬y ∨ ¬z)
not satisfiable
Example:
⇒ x
x ⇒ y
(x ∧ z) ⇒ w
(x ∧ y) ⇒ w
(w ∧ y ∧ z) ⇒ x
(¬w ∨ ¬x ∨ ¬y)
?
implications, e.g. (x ∧ u) ⇒ z, tell us to set some variables to true
negative clauses, e.g. (¬x ∨ ¬y ∨ ¬z), encourage us to make them false
A brute force solution
Try each setting of the boolean variables and see if any of them satisfy the formula
For n variables, how many settings are there? 2^n
A greedy solution?
⇒ x
x ⇒ y
(x ∧ z) ⇒ w
(x ∧ y) ⇒ w
(w ∧ y ∧ z) ⇒ x
(¬w ∨ ¬x ∨ ¬y)
Start with every variable false: w = 0, x = 0, y = 0, z = 0
⇒ x forces x to be true: w = 0, x = 1, y = 0, z = 0
x ⇒ y forces y to be true: w = 0, x = 1, y = 1, z = 0
(x ∧ y) ⇒ w forces w to be true: w = 1, x = 1, y = 1, z = 0
(¬w ∨ ¬x ∨ ¬y) is now violated: not satisfiable
A greedy solution
set all variables of the implications of the form “⇒ x” to true
if all of the variables on the lhs of an implication are true, then set the rhs variable to true
see if all of the negative clauses are satisfied
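The pseudocode for these steps is not in the transcript, so here is a minimal sketch under stated assumptions. Implications are represented as `(lhs, rhs)` pairs where `lhs` is a set of variable names (empty for a bare "⇒ x"), and negative clauses as sets of variable names read as an OR of their negations. This representation and the name `horn_satisfiable` are illustrative choices, not the lecture's.

```python
def horn_satisfiable(variables, implications, negative_clauses):
    """Return a satisfying assignment as a dict, or None if there is none."""
    assignment = {v: False for v in variables}   # start with everything false
    changed = True
    while changed:                               # repeat until nothing new is forced
        changed = False
        for lhs, rhs in implications:
            # An empty lhs is vacuously true, which covers the "=> x" form.
            if not assignment[rhs] and all(assignment[v] for v in lhs):
                assignment[rhs] = True
                changed = True
    for clause in negative_clauses:
        # A negative clause fails only when all of its variables are true.
        if all(assignment[v] for v in clause):
            return None
    return assignment


implications = [
    (set(), "x"),
    ({"x"}, "y"),
    ({"x", "z"}, "w"),
    ({"x", "y"}, "w"),
    ({"w", "y", "z"}, "x"),
]
result = horn_satisfiable("wxyz", implications, [{"w", "x", "y"}])
```

Each pass of the outer loop either sets a new variable to true or terminates, so there are at most n + 1 passes over the implications. Here `result` is `None`, because x, y and w are all forced to true and the negative clause over w, x, y fails.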
Correctness of greedy solution
Two parts:
If our algorithm returns an assignment, is it a valid assignment?
If our algorithm does not return an assignment, does an assignment exist?
If our algorithm returns an assignment, is it a valid assignment?
Negative clauses: we explicitly check all negative clauses
Implications: we don’t stop until every implication whose lhs variables are all true has its rhs true
If our algorithm does not return an assignment, does an assignment exist?
Our algorithm is “stingy”. It only sets those variables that have to be true. All others remain false. Every satisfying assignment must therefore set at least these variables to true, so a negative clause that our assignment violates is violated by every assignment that satisfies the implications, and no satisfying assignment exists.
Running time?
O(nm)
n = number of variables
m = number of formulas
Knapsack problems: Greedy or not?
0-1 Knapsack – A thief robbing a store finds n items worth v1, v2, …, vn dollars and weighing w1, w2, …, wn pounds, where vi and wi are integers. The thief can carry at most W pounds in the knapsack. Which items should the thief take to maximize value?
Fractional knapsack problem – Same as above, but the thief happens to be at the bulk section of the store and can carry fractional portions of the items. For example, the thief could take 20% of item i for a weight of 0.2wi and a value of 0.2vi.
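One way to probe the "greedy or not?" question is to try a concrete greedy rule. For the fractional variant, taking items in decreasing order of value per pound is a standard greedy rule that yields an optimal answer; the same rule can fail for the 0-1 variant. The sketch below is illustrative only: items are assumed to be `(value, weight)` pairs with positive weight, and the sample numbers are made up for the example.

```python
def fractional_knapsack(items, capacity):
    """Best total value when any fraction of an item may be taken."""
    total = 0.0
    remaining = capacity
    # Consider items from highest to lowest value per pound.
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if remaining <= 0:
            break
        take = min(weight, remaining)    # the whole item, or the fraction that fits
        total += value * take / weight
        remaining -= take
    return total


items = [(60, 10), (100, 20), (120, 30)]   # hypothetical (value, weight) pairs
best_fractional = fractional_knapsack(items, 50)
```

With these numbers the fractional answer is 240. For the 0-1 version, the same density ordering would take the first two items (value 160) and then have no room for the third, whereas taking the second and third items gives 220, so this greedy rule is not optimal there.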
Data compression
Given a file containing some data of a fixed alphabet Σ (e.g. A, B, C, D), we would like to pick a binary character code that minimizes the number of bits required to represent the data.
A C A D A A D B … 0010100100100 …
minimize the size of the encoded file
Compression algorithms
http://en.wikipedia.org/wiki/Data_compression
Simplifying assumption over general compression?
Frequency only
The problem formulation makes the simplifying assumption that we only have character frequency information for a file
A C A D A A D B … =
Symbol Frequency
A 70
B 3
C 20
D 37
Fixed length code
Use ceil(log2 |Σ|) bits for each character
A = 00, B = 01, C = 10, D = 11
How many bits to encode the file?
2 x 70 + 2 x 3 + 2 x 20 + 2 x 37 = 260 bits
Can we do better?
Variable length code
What about:
A = 0, B = 01, C = 10, D = 1
1 x 70 + 2 x 3 + 2 x 20 + 1 x 37 = 173 bits
Decoding a file
A = 0, B = 01, C = 10, D = 1
010100011010
What characters does this sequence represent?
Do the first two bits, 01, encode A D or B?
Variable length code
What about:
A = 0, B = 100, C = 101, D = 11
How many bits to encode the file?
1 x 70 + 3 x 3 + 3 x 20 + 2 x 37 = 213 bits (18% reduction)
Prefix codes
A prefix code is a set of codewords in which no codeword is a prefix of any other codeword
A = 0, B = 100, C = 101, D = 11 (prefix code)
A = 0, B = 01, C = 10, D = 1 (not a prefix code: 0 is a prefix of 01 and 1 is a prefix of 10)
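The definition translates into a direct check. A minimal sketch, assuming a code is given as a dict from symbol to bit string (the function name is illustrative):

```python
def is_prefix_code(code):
    """True if no codeword is a prefix of a different codeword."""
    words = list(code.values())
    return not any(
        i != j and words[j].startswith(words[i])
        for i in range(len(words))
        for j in range(len(words))
    )


good = is_prefix_code({"A": "0", "B": "100", "C": "101", "D": "11"})
bad = is_prefix_code({"A": "0", "B": "01", "C": "10", "D": "1"})
```

The pairwise comparison is quadratic in the number of codewords, which is fine for a small alphabet; `good` is `True` and `bad` is `False`.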
Prefix tree
We can encode a prefix code using a full binary tree where each leaf represents the encoding of a symbol
A = 0, B = 100, C = 101, D = 11
[Figure: prefix tree with edges labeled 0 and 1; leaf A at depth 1, leaf D at depth 2, leaves B and C at depth 3]
Decoding using a prefix tree
To decode, we traverse the tree from the root until a leaf node is reached, output the symbol, and start again at the root
1000111010100
100 | 0 | 11 | 101 | 0 | 100
B A D C A B
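The traversal can be sketched with the tree written as nested pairs. The representation is an assumption made for this example: each internal node is a pair `(child on bit 0, child on bit 1)` and each leaf is a one-character symbol.

```python
# Prefix tree for A = 0, B = 100, C = 101, D = 11.
TREE = ("A", (("B", "C"), "D"))


def decode(bits, tree):
    """Walk the tree bit by bit, emitting a symbol at every leaf."""
    out = []
    node = tree
    for bit in bits:
        node = node[int(bit)]        # follow the 0 or 1 edge
        if isinstance(node, str):    # reached a leaf
            out.append(node)
            node = tree              # restart at the root
    return "".join(out)


decoded = decode("1000111010100", TREE)
```

Because no codeword is a prefix of another, the walk never has to backtrack: `decoded` is `"BADCAB"`.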
Determining the cost of a file
Symbol Frequency
A 70
B 3
C 20
D 37
$\text{cost}(T) = \sum_{i=1}^{n} f_i \cdot \text{depth}(i)$
What if we label the internal nodes with the sum of the children?
[Figure: prefix tree with leaves labeled A 70, B 3, C 20, D 37 and internal nodes labeled 23 (B and C) and 60 (BC and D)]
Cost is equal to the sum of the labels of the internal nodes (other than the root) and the leaf nodes: 60 + 23 + 70 + 3 + 20 + 37 = 213
70 times we see a 0 by itself
60 times we see a prefix that starts with a 1
of those, 37 times we see an additional 1
the remaining 23 times we see an additional 0
of these, 20 times we see a last 1 and 3 times a last 0
As we move down the tree, one bit gets read for every nonroot node
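Both ways of computing the cost can be checked against each other. In this sketch the tree representation is an assumption for illustration: leaves are `(symbol, frequency)` pairs and internal nodes are `(left, right)` pairs.

```python
# Tree for A = 0, B = 100, C = 101, D = 11 with the slide's frequencies.
TREE = (("A", 70), ((("B", 3), ("C", 20)), ("D", 37)))


def is_leaf(node):
    return isinstance(node[0], str)


def weight(node):
    """Label of a node: its frequency, or the sum of its children's labels."""
    return node[1] if is_leaf(node) else weight(node[0]) + weight(node[1])


def cost_by_depth(node, depth=0):
    """Sum of frequency * depth over the leaves."""
    if is_leaf(node):
        return node[1] * depth
    return cost_by_depth(node[0], depth + 1) + cost_by_depth(node[1], depth + 1)


def cost_by_labels(node, is_root=True):
    """Sum of the labels of every node except the root."""
    own = 0 if is_root else weight(node)
    if is_leaf(node):
        return own
    return own + cost_by_labels(node[0], False) + cost_by_labels(node[1], False)
```

Both functions return 213 on this tree, matching the 213 bits computed for the variable-length code.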
A greedy algorithm?
Given file frequencies, can we come up with a prefix-free encoding (i.e. build a prefix tree) that minimizes the number of bits?
Symbol Frequency
A 70
B 3
C 20
D 37
Heap: B 3, C 20, D 37, A 70
Merge B and C into a node labeled 23. Heap: BC 23, D 37, A 70
merging with this node will incur an additional cost of 23
Merge BC and D into a node labeled 60. Heap: BCD 60, A 70
Merge BCD and A into a node labeled 130. Heap: ABCD 130
Is it correct?
The algorithm selects the symbols with the two smallest frequencies first (call them f1 and f2)
Consider a tree that did not do this:
[Figure: a tree with leaf f1 one level above the sibling leaves fi and f2, next to the same tree with f1 and fi swapped so that f1 and f2 are siblings]
Swapping f1 and fi:
- frequencies don’t change
- cost will decrease since f1 < fi
contradiction (the first tree could not have been optimal)
$\text{cost}(T) = \sum_{i=1}^{n} f_i \cdot \text{depth}(i)$
Runtime?
1 call to MakeHeap
2(n-1) calls ExtractMin
n-1 calls Insert
O(n log n)
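The heap-based construction can be sketched with Python's `heapq`. This is an illustrative version, not the lecture's code: it assumes a non-empty dict of string symbols, and it adds a counter as a tie-breaker so that heap entries with equal frequency never compare the tree nodes themselves.

```python
import heapq
from itertools import count


def huffman_code(freq):
    """Build a prefix code by repeatedly merging the two lowest-frequency nodes."""
    tiebreak = count()                       # keeps heap entries comparable on ties
    heap = [(f, next(tiebreak), sym) for sym, f in freq.items()]
    heapq.heapify(heap)                      # 1 call to MakeHeap
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # two ExtractMin calls per merge
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))  # one Insert
    code = {}

    def assign(node, prefix):
        if isinstance(node, tuple):          # internal node: recurse on both children
            assign(node[0], prefix + "0")
            assign(node[1], prefix + "1")
        else:
            code[node] = prefix or "0"       # single-symbol alphabet edge case

    assign(heap[0][2], "")
    return code


code = huffman_code({"A": 70, "B": 3, "C": 20, "D": 37})
```

The exact bit patterns depend on which child is labeled 0, so they may differ from the slides, but the codeword lengths (1 for A, 2 for D, 3 for B and C) and the 213-bit total are the same.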
Non-optimal greedy algorithms
All the greedy algorithms we’ve looked at today give the optimal answer
Some of the most common greedy algorithms generate good, but non-optimal solutions:
set cover
clustering
hill-climbing
relaxation