# 5 greedy algorithms - computer duan/class/435/notes/5_greedy_ · 1 chapter 5 greedy algorithms

Post on 12-Aug-2019


Chapter 5

Greedy Algorithms

Optimization Problems

Optimization problem: a problem of finding the best solution from all feasible solutions.

Two common techniques:
- Greedy Algorithms (local)
- Dynamic Programming (global)

Greedy Algorithms

Greedy algorithms typically consist of
- A set of candidate solutions
- A function that checks if the candidates are feasible
- A selection function indicating at a given time which is the most promising candidate not yet used
- An objective function giving the value of a solution; this is the function we are trying to optimize

Step by Step Approach

- Initially, the set of chosen candidates is empty.
- At each step, add to this set the best remaining candidate; this is guided by the selection function.
- If the enlarged set is no longer feasible, remove the candidate just added; otherwise it stays.
- Each time the set of chosen candidates is enlarged, check whether the current set now constitutes a solution to the problem.

When a greedy algorithm works correctly, the first solution found in this way is always optimal.
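The step-by-step approach can be sketched as a generic loop. This is a minimal sketch, not a definitive implementation: the function name `greedy` and its four callback parameters are assumptions chosen to mirror the components listed above, and the change-making instance (63 cents, a limited supply of coins) is a hypothetical illustration that does not come from the slides.

```python
def greedy(candidates, select, feasible, is_solution):
    """Generic greedy loop; returns the chosen candidates, or None on failure."""
    remaining = list(candidates)
    chosen = []
    while remaining and not is_solution(chosen):
        best = select(remaining)        # most promising candidate not yet used
        remaining.remove(best)          # never reconsidered, whether kept or not
        if feasible(chosen + [best]):   # keep it only if the set stays feasible
            chosen.append(best)
    return chosen if is_solution(chosen) else None


# Hypothetical instance: pay 63 cents from a limited supply of coins.
TARGET = 63
coins = [25] * 3 + [10] * 3 + [5] * 3 + [1] * 5
change = greedy(
    coins,
    select=max,                                  # largest coin first
    feasible=lambda s: sum(s) <= TARGET,         # never overshoot the target
    is_solution=lambda s: sum(s) == TARGET,      # stop once the amount is paid
)
print(change)  # [25, 25, 10, 1, 1, 1]
```

The loop never revisits a rejected candidate, which is what makes it greedy; whether the result is optimal depends on the problem and on the selection function.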

Examples of Greedy Algorithms

Graph Algorithms
- Breadth-First Search (shortest path for unweighted graphs)
- Dijkstra’s (shortest path) Algorithm
- Minimum Spanning Trees

Data compression
- Huffman coding

Scheduling
- Activity Selection
- Minimizing time in system
- Deadline scheduling

Other Heuristics
- Coloring a graph
- Traveling Salesman
- Set-covering

Elements of Greedy Strategy

Greedy-choice property: a globally optimal solution can be arrived at by making locally optimal (greedy) choices.

Optimal substructure: an optimal solution to the problem contains within it optimal solutions to subproblems.
- Be able to demonstrate that if A is an optimal solution containing s1, then the set A’ = A - {s1} is an optimal solution to a smaller problem without s1.


Analysis

The selection function is usually based on the objective function; they may be identical. Often, however, there are several plausible selection functions.

At every step, the procedure chooses the best candidate, without worrying about the future. It never changes its mind: once a candidate is included in the solution, it is there for good; once a candidate is excluded, it’s never considered again.

Greedy algorithms do NOT always yield optimal solutions, but for many problems they do.

Huffman Coding

Huffman codes: a very effective technique for compressing data, typically saving 20% to 90%.

ASCII table

Coding Problem:
- Consider a data file of 100,000 characters.
- You can safely assume that there are many a, e, i, o, u, blanks, and newlines, and few q, x, z’s.
- Want to store it compactly.

Solution:
- Fixed-length code, e.g. ASCII, 8 bits per character
- Variable-length code, e.g. Huffman code (can take advantage of the relative frequency of letters to save space)

Example

Char    Frequency   Code   Total Bits
E       120         000    360
L       42          001    126
D       42          010    126
U       37          011    111
C       32          100    96
M       24          101    72
K       7           110    21
Z       2           111    6
Total   306                918

• Fixed-length code: with 8 distinct characters, 3 bits are needed for each char
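The arithmetic behind the fixed-length total can be checked with a few lines. This is a small sketch; the names `FREQ` and `fixed_length_bits` are assumptions introduced for illustration, and the frequencies are the ones from the example table.

```python
# Frequencies from the example table.
FREQ = {"E": 120, "L": 42, "D": 42, "U": 37, "C": 32, "M": 24, "K": 7, "Z": 2}


def fixed_length_bits(n_symbols):
    """Smallest number of bits that gives every symbol a distinct code."""
    return max(1, (n_symbols - 1).bit_length())


bits_per_char = fixed_length_bits(len(FREQ))         # 8 symbols -> 3 bits
total_bits = bits_per_char * sum(FREQ.values())      # 3 * 306 characters
print(bits_per_char, total_bits)  # 3 918
```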


Example (cont.)

Frequencies: E:120  D:42  L:42  U:37  C:32  M:24  K:7  Z:2

[Figure: complete binary tree for the fixed-length code, with edges labeled 0 and 1 and the characters at the leaves]

Char   E    L    D    U    C    M    K    Z
Code   000  001  010  011  100  101  110  111

Example (cont.)

Variable-length code (can take advantage of the relative frequency of letters to save space)
- Huffman codes

Char   E  L  D  U  C  M  K  Z
Code

Huffman Tree Construction (1)

1. Associate each char with a weight (= frequency) to form a subtree of one node (char, weight)
2. Group all subtrees to form a forest
3. Sort subtrees by ascending weight of subroots
4. Merge the first two subtrees (the ones with the lowest weights)
5. Assign the new subroot a weight equal to the sum of the weights of its two children
6. Repeat steps 3, 4, and 5 until only one tree remains in the forest
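The construction steps above can be sketched in Python. One substitution is made and should be read as an assumption: instead of re-sorting the forest on every pass (step 3), the sketch keeps the subtrees in a min-heap via the standard `heapq` module, which yields the two lowest-weight subtrees in the same way. The function name `huffman_codes`, the tuple representation of internal nodes, and the convention "left edge = 0, right edge = 1" are also illustrative choices; the exact bit patterns depend on tie-breaking and may differ from the slides, while the code lengths and total cost do not.

```python
import heapq
from itertools import count

# Frequencies from the example in these notes.
FREQ = {"E": 120, "L": 42, "D": 42, "U": 37, "C": 32, "M": 24, "K": 7, "Z": 2}


def huffman_codes(freq):
    """Build a Huffman tree for a non-empty {char: weight} map; return {char: code}."""
    tiebreak = count()  # unique counter so the heap never compares tree nodes
    heap = [(w, next(tiebreak), ch) for ch, w in freq.items()]  # one-node subtrees
    heapq.heapify(heap)                                         # the forest
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # lowest weight
        w2, _, right = heapq.heappop(heap)   # second lowest weight
        # Merge: the new subroot's weight is the sum of its children's weights.
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (left, right)))
    _, _, root = heap[0]

    codes = {}

    def assign(node, prefix):
        if isinstance(node, tuple):          # internal node: (left, right)
            assign(node[0], prefix + "0")
            assign(node[1], prefix + "1")
        else:                                # leaf: a single character
            codes[node] = prefix or "0"      # a lone symbol still needs one bit

    assign(root, "")
    return codes


codes = huffman_codes(FREQ)
total_bits = sum(FREQ[ch] * len(code) for ch, code in codes.items())
print(total_bits)  # 785
```

For these frequencies, E receives a 1-bit code and K and Z receive 6-bit codes, for a total of 785 bits against 918 for the fixed-length code.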

Huffman Tree Construction (2)

Huffman Tree Construction (3)


Assigning Codes

Char    Freq   Code     Bits
Z       2      111100   12
U       37     100      111
L       42     110      126
K       7      111101   42
M       24     11111    120
E       120    0        120
D       42     101      126
C       32     1110     128
Total                   785

Compare with: 918 (~15% less)

Huffman Coding Tree


Coding and Decoding

Fixed-length code:

Char   E    L    D    U    C    M    K    Z
Code   000  001  010  011  100  101  110  111

DEED: 010000000010
MUCK: 101011100110

Huffman code:

Char   E  L    D    U    C     M      K       Z
Code   0  110  101  100  1110  11111  111101  111100

DEED: 10100101
MUCK: 111111001110111101
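Encoding with either table is a per-character lookup followed by concatenation. The sketch below reproduces the four bit strings above; the names `FIXED`, `HUFFMAN`, and `encode` are assumptions introduced for illustration.

```python
# Code tables copied from the notes.
FIXED = {"E": "000", "L": "001", "D": "010", "U": "011",
         "C": "100", "M": "101", "K": "110", "Z": "111"}
HUFFMAN = {"E": "0", "L": "110", "D": "101", "U": "100",
           "C": "1110", "M": "11111", "K": "111101", "Z": "111100"}


def encode(text, table):
    """Concatenate the code of each character; raises KeyError on unknown chars."""
    return "".join(table[ch] for ch in text)


print(encode("DEED", FIXED))    # 010000000010
print(encode("MUCK", FIXED))    # 101011100110
print(encode("DEED", HUFFMAN))  # 10100101
print(encode("MUCK", HUFFMAN))  # 111111001110111101
```

DEED shrinks from 12 bits to 8 because E is frequent, while MUCK grows from 12 bits to 18 because its letters are rare.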

Prefix codes

A set of codes is said to meet the prefix property if no code in the set is the prefix of another. Such codes are called prefix codes.

Huffman codes are prefix codes.

Char   E  L    D    U    C     M      K       Z
Code   0  110  101  100  1110  11111  111101  111100
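The prefix property is what lets a decoder split a bit string without separators: as soon as the bits read so far match a code, that match is the only possible one. The following sketch checks the property for the table above and decodes with it. The names `is_prefix_free` and `decode` are assumptions, and the sorted-neighbor check is just one convenient way to test the property.

```python
HUFFMAN = {"E": "0", "L": "110", "D": "101", "U": "100",
           "C": "1110", "M": "11111", "K": "111101", "Z": "111100"}


def is_prefix_free(codes):
    """True if no code is a prefix of another.

    After sorting, any code that is a prefix of another code is also a
    prefix of its immediate successor, so checking neighbors is enough.
    """
    words = sorted(codes)
    return not any(b.startswith(a) for a, b in zip(words, words[1:]))


def decode(bits, table):
    """Read bits left to right, emitting a character whenever a code matches."""
    inverse = {code: ch for ch, code in table.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code")
    return "".join(out)


print(is_prefix_free(HUFFMAN.values()))       # True
print(decode("10100101", HUFFMAN))            # DEED
print(decode("111111001110111101", HUFFMAN))  # MUCK
```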

LZW

- A universal lossless data compression algorithm created by Abraham Lempel, Jacob Ziv, and Terry Welch.
- It is the basis of many PC utilities that claim to “double the capacity of your hard drive”
- Unix file compression utility compress
- GIF image, …
- Universal coding schemes, like LZW, do not require advance knowledge and can build such knowledge on-the-fly.

LZW

- LZW compression uses a code table, with 4096 as a common choice for the number of table entries.
- Codes 0-255 are assigned to the single bytes (the ASCII table).
- When encoding begins, the code table contains only the first 256 entries, with the remainder of the table being blank.
- Compression is achieved by using codes 256 through 4095 to represent sequences of bytes. LZW identifies repeated sequences as encoding proceeds and adds them to the code table.
- Decoding is achieved by taking each code from the compressed file and translating it through the code table, which is built in the same manner as during encoding.

LZW code table

[Figure: LZW code table; the surviving entries are the strings BA, AB, BAA, …]

LZW

Initialize table with single character strings
P = first input character
WHILE not end of input stream
    C = next input character
    IF P + C is in the string table
        P = P + C
    ELSE
        output the code for P
        add P + C to the string table
        P = C
END WHILE
output code for P
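The compression pseudocode translates almost line for line into Python. This is a minimal sketch under stated assumptions: the function name `lzw_compress` is illustrative, input characters are assumed to have code points below 256, and the table is allowed to grow without the 4096-entry cap mentioned earlier.

```python
def lzw_compress(text):
    """Return the list of LZW output codes for a string of 8-bit characters."""
    table = {chr(i): i for i in range(256)}  # single-character strings
    if not text:
        return []
    output = []
    p = text[0]                              # P = first input character
    for c in text[1:]:                       # C = next input character
        if p + c in table:
            p = p + c                        # keep extending the current match
        else:
            output.append(table[p])          # output the code for P
            table[p + c] = len(table)        # next free code: 256, 257, ...
            p = c
    output.append(table[p])                  # output code for the final P
    return output


print(lzw_compress("BABAABAAA"))  # [66, 65, 256, 257, 65, 260]
```

Nine input characters become six output codes; the saving comes from the codes 256, 257, and 260, each of which stands for a two-character string.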


LZW - Example

Example 1: Use LZW to compress the string BABAABAAA

Example 1: LZW Compression Step 1

BABAABAAA    P=A    C=empty

ENCODER OUTPUT               STRING TABLE
output code   representing   codeword   string
66            B              256        BA

Example 1: LZW Compression Step 2

BABAABAAA    P=B    C=empty

ENCODER OUTPUT               STRING TABLE
output code   representing   codeword   string
66            B              256        BA
65            A              257        AB

Example 1: LZW Compression Step 3

BABAABAAA    P=A    C=empty

ENCODER OUTPUT               STRING TABLE
output code   representing   codeword   string
66            B              256        BA
65            A              257        AB
256           BA             258        BAA

Example 1: LZW Compression Step 4

BABAABAAA    P=A    C=empty

ENCODER OUTPUT               STRING TABLE
output code   representing   codeword   string
66            B              256        BA
65            A              257        AB
256           BA             258        BAA
257           AB             259        ABA

Example 1: LZW Compression Step 5

BABAABAAA    P=A    C=A

ENCODER OUTPUT               STRING TABLE
output code   representing   codeword   string
66            B              256        BA
65            A              257        AB
256           BA             258        BAA
257           AB             259        ABA
65            A              260        AA


Example 1: LZW Compression Step 6

BABAABAAA    P=AA    C=empty

ENCODER OUTPUT               STRING TABLE
output code   representing   codeword   string
66            B              256        BA
65            A              257        AB
256           BA             258        BAA
257           AB             259        ABA
65            A              260        AA
260           AA

LZW Decompression

The LZW decompressor creates the same string table during decompression.

It starts with the first 256 table entries initialized to single characters.

The string table is updated for each code in the input stream, except the first one.

Decoding achieved by reading codes and translating them through the code table being built.

LZW Decompression

Initialize table with single character strings
OLD = first input code
output translation of OLD
C = first character of translation of OLD
WHILE not end of input stream
    NEW = next input code
    IF NEW is not in the string table
        S = translation of OLD
        S = S + C
    ELSE
        S = translation of NEW
    output S
    C = first character of S
    add translation of OLD + C to the string table
    OLD = NEW
END WHILE
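A Python rendering of the decompression procedure is sketched below. The function name `lzw_decompress` is an assumption, and so is one detail: C is initialized from the first decoded string, so that the "NEW is not in the string table" branch is well defined even when it is taken on the very first pass through the loop (as happens for the code sequence 65, 256).

```python
def lzw_decompress(codes):
    """Rebuild the original string from a list of LZW codes."""
    table = {i: chr(i) for i in range(256)}  # single-character strings
    if not codes:
        return ""
    old = codes[0]
    out = [table[old]]                       # output translation of OLD
    c = table[old][0]                        # assumption: seed C from first string
    for new in codes[1:]:
        if new not in table:
            s = table[old] + c               # code not yet defined: OLD's string + C
        else:
            s = table[new]
        out.append(s)
        c = s[0]                             # first character of S
        table[len(table)] = table[old] + c   # same entry the encoder added
        old = new
    return "".join(out)


print(lzw_decompress([66, 65, 256, 257, 65, 260]))  # BABAABAAA
```

The last code, 260, is not yet in the table when it arrives, so this input exercises the special branch: S becomes A + A = AA, which is exactly the entry the encoder had just created.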

Example 2: LZW Decompression

Example 2: Use LZW to decompress the output sequence of Example 1:

66 65 256 257 65 260

LZW Decompression Step 1

Old = 65    S = A    New = 65    C = A

DECODER OUTPUT   STRING TABLE
string           codeword   string
B
A                256        BA

LZW Decompression Step 2

Old = 256    S = BA    New = 256    C = B

DECODER OUTPUT   STRING TABLE
string           codeword   string
B
A                256        BA
BA               257        AB


LZW Decompression Step 3

Old = 257    S = AB    New = 257    C = A

DECODER OUTPUT   STRING TABLE
string           codeword   string
B
A                256        BA
BA               257        AB
AB               258        BAA

LZW Decompression Ste