
Page 1:

CSCE350 Algorithms and Data Structure

Lecture 19

Jianjun Hu

Department of Computer Science and Engineering

University of South Carolina

2009.11.

Page 2:

Hw5 summary

Average 39.97

Max: 49

Common problems:

Page 3:

Pseudocode of Floyd’s Algorithm

The next matrix in the sequence can be written over its predecessor.

ALGORITHM Floyd(W[1..n, 1..n])
// Implements Floyd's algorithm for the all-pairs shortest-paths problem
// Input: The weight matrix W of a graph
// Output: The distance matrix D of the shortest paths' lengths
D ← W
for k ← 1 to n do
    for i ← 1 to n do
        for j ← 1 to n do
            D[i, j] ← min(D[i, j], D[i, k] + D[k, j])
return D
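As a concrete sketch of the pseudocode above (the function and variable names are my own, not from the slides), the triple loop translates directly into Python, with infinity marking a missing edge:

```python
INF = float('inf')

def floyd(W):
    """All-pairs shortest paths (Floyd's algorithm).
    W is an n x n weight matrix; INF marks a missing edge.
    D is written over a copy of W, as the slide notes."""
    n = len(W)
    D = [row[:] for row in W]          # D <- W
    for k in range(n):                 # for k <- 1 to n do
        for i in range(n):
            for j in range(n):
                # D[i,j] <- min(D[i,j], D[i,k] + D[k,j])
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D

# Small illustrative weight matrix (hypothetical, not from the lecture)
W = [[0,   3,   INF],
     [3,   0,   1],
     [INF, 1,   0]]
print(floyd(W))  # [[0, 3, 4], [3, 0, 1], [4, 1, 0]]
```

The three nested loops make the O(n³) = O(|V|³) cost mentioned on the next slide immediate.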

Page 4:

Dijkstra’s Algorithm

Using a greedy strategy to find the single-source shortest paths

In Floyd's algorithm, we find the all-pairs shortest paths, which may not be necessary in many applications

Single-source shortest path problem: find a family of shortest paths starting from a given vertex to every other vertex

Certainly, the all-pairs shortest paths contain the single-source shortest paths. But Floyd's algorithm has O(|V|³) complexity!

There are many algorithms that can solve this problem; here we introduce Dijkstra's algorithm

Note: Dijkstra's algorithm only works when all the edge weights are nonnegative.

Page 5:

Dijkstra’s Algorithm on Undirected Graph

Similar to Prim's MST algorithm, with the following differences:
• Start with a tree consisting of one vertex (the source)
• "Grow" the tree one vertex/edge at a time to produce a spanning tree
  – Construct a series of expanding subtrees T1, T2, …
• Keep track of the shortest path from the source to each of the vertices in Ti
• At each stage construct Ti+1 from Ti: add the minimum-weight edge connecting a vertex in the tree (Ti) to one not yet in the tree, i.e., the edge (v,w) with lowest d(s,v) + d(v,w)
  – Choose from "fringe" edges (this is the "greedy" step!)
• The algorithm stops when all vertices are included

Page 6:

Example:

[Figure: an undirected graph on vertices a, b, c, d, e with edge weights a-b = 3, a-d = 7, b-c = 4, b-d = 2, c-d = 5, c-e = 6, d-e = 4]

Find the shortest paths starting from vertex a

Page 7:

Step 1:

Tree vertices: a(-,0)

Remaining vertices: b(a,3), c(-,∞), d(a,7), e(-,∞)


Page 8:

Step 2:

Tree vertices: a(-,0), b(a,3),

Remaining vertices: c(-,∞) → c(b,3+4), d(a,7) → d(b,3+2), e(-,∞)


Page 9:

Step 3:

Tree vertices: a(-,0), b(a,3), d(b,5)

Remaining vertices: c(b,3+4), e(-,∞) → e(d,5+4)


Page 10:

Step 4:

Tree vertices: a(-,0), b(a,3), d(b,5), c(b,7)

Remaining vertices: e(d,9)


Page 11:

Step 5:

Tree vertices: a(-,0), b(a,3), d(b,5), c(b,7), e(d,9)

Remaining vertices: none; the algorithm is done!


Page 12:

Output the Single-Source Shortest Paths

Tree vertices: a(-,0), b(a,3), d(b,5), c(b,7), e(d,9)


from a to b: a-b of length 3

from a to d: a-b-d of length 5

from a to c: a-b-c of length 7

from a to e: a-b-d-e of length 9

Page 13:

Proof of the Correctness

By induction:

Assume the first i steps provide shortest paths to the first i vertices.

Prove that the (i+1)-th step generates the shortest path from the source to the (i+1)-th vertex.

Page 14:

ALGORITHM Dijkstra(G, s)
// Dijkstra's algorithm for single-source shortest paths
// Input: A weighted connected graph G = (V, E) with nonnegative weights
//        and its vertex s
// Output: The length d_v of a shortest path from s to v and its
//         penultimate vertex p_v for every vertex v in V
Initialize(Q)  // initialize vertex priority queue to empty
for every vertex v in V do
    d_v ← ∞;  p_v ← null
    Insert(Q, v, d_v)  // initialize vertex priority in the priority queue
d_s ← 0;  Decrease(Q, s, d_s)  // update priority of s with d_s
V_T ← ∅
for i ← 0 to |V| − 1 do
    u* ← DeleteMin(Q)  // delete the minimum priority element
    V_T ← V_T ∪ {u*}
    for every vertex u in V − V_T that is adjacent to u* do
        if d_u* + w(u*, u) < d_u
            d_u ← d_u* + w(u*, u);  p_u ← u*
            Decrease(Q, u, d_u)
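A runnable Python sketch of the algorithm above (names are my own), using heapq as the priority queue with lazy deletion of stale entries in place of an explicit Decrease operation. The edge list is reconstructed from the step-by-step labels in the earlier example:

```python
import heapq

def dijkstra(graph, s):
    """Single-source shortest paths for nonnegative edge weights.
    graph: dict mapping vertex -> list of (neighbor, weight) pairs.
    Returns (d, p): distances from s and penultimate vertices,
    matching the (p, d) labels used on the slides."""
    d = {v: float('inf') for v in graph}
    p = {v: None for v in graph}
    d[s] = 0
    pq = [(0, s)]                     # priority queue of (distance, vertex)
    visited = set()                   # the tree vertices V_T
    while pq:
        du, u = heapq.heappop(pq)     # DeleteMin
        if u in visited:
            continue                  # stale entry: stands in for Decrease
        visited.add(u)
        for v, w in graph[u]:
            if du + w < d[v]:         # the relaxation step of the pseudocode
                d[v] = du + w
                p[v] = u
                heapq.heappush(pq, (d[v], v))
    return d, p

# The undirected example graph from the lecture's step-by-step slides
edges = [('a','b',3), ('a','d',7), ('b','c',4), ('b','d',2),
         ('c','d',5), ('c','e',6), ('d','e',4)]
graph = {v: [] for v in 'abcde'}
for u, v, w in edges:
    graph[u].append((v, w)); graph[v].append((u, w))

d, p = dijkstra(graph, 'a')
print(d)  # {'a': 0, 'b': 3, 'c': 7, 'd': 5, 'e': 9}
```

Following the p links backward from any vertex reproduces the paths listed on the output slide, e.g. e → d → b → a gives a-b-d-e of length 9.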

Page 15:

Notes on Dijkstra’s algorithm

Doesn’t work with negative weights

Applicable to both undirected and directed graphs

Efficiency:
• O(|E| log |V|) when the graph is represented by adjacency linked lists and the priority queue is implemented by a min-heap
• Reason: the whole process is almost the same as in Prim's algorithm. Following the same analysis as for Prim's algorithm, we can see that it has the same complexity.

The efficiency can be further improved if a more sophisticated data structure called a Fibonacci heap is used.

Page 16:

Huffman Trees – Coding Problem

Text consists of characters from some n-character alphabet

In communication, we need to code each character by a sequence of bits – codeword

Fixed-length encoding: the code for each character has m bits, and we know that m ≥ log₂n (why?)

Standard seven-bit ASCII code does this

Using shorter bit strings to code a text → variable-length encoding: use shorter codewords for more frequent characters and longer codewords for less frequent ones
• For example, in Morse telegraph code: e(.), a(.-), q(--.-), z(--..)
• How many bits represent the first character? We want no confusion here.

Page 17:

Huffman Tree – From coding to a binary tree

To avoid confusion, here we consider prefix codes, where no codeword is a prefix of another character's codeword

This way, we can scan the bit string constructed from a text from the beginning until we get the first group of bits that is a valid codeword (for some character)

It is natural to associate the alphabet's characters with the leaves of a binary tree in which all the left edges are labeled by 0 and all the right edges are labeled by 1 (or vice versa)

The codeword of a character (leaf) is the list of labels along the path from the root of the binary tree to this leaf

A Huffman tree can reduce the total bit string length by assigning shorter codewords to frequent characters and longer codewords to less frequent ones.

Page 18:

Huffman Coding Algorithm

Step 1: Initialize n one-node trees and label them with the characters of the alphabet. Record the frequency of each character in its tree's root to indicate the tree's weight.

Step 2: Repeat the following operation until a single tree is obtained. Find the two trees with the smallest weights. Make them the left and right subtrees of a new tree, and record the sum of their weights in the root of the new tree as its weight.

Example: alphabet A, B, C, D, _ with frequencies

character    A     B     C     D     _
probability  0.35  0.1   0.2   0.2   0.15
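The two steps above can be sketched in Python (names my own), using heapq for the repeated smallest-two selection; with different tie-breaking the codewords may differ from the slide's tree, but the expected code length is the same for any optimal tree:

```python
import heapq
from itertools import count

def huffman_codes(freq):
    """Build a Huffman code from a {character: frequency} map.
    Step 1: one single-node tree per character, weighted by frequency.
    Step 2: repeatedly merge the two smallest-weight trees."""
    tiebreak = count()                 # avoids comparing trees on weight ties
    heap = [(w, next(tiebreak), c) for c, w in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (left, right)))
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):    # internal node: 0 = left, 1 = right
            walk(node[0], prefix + '0')
            walk(node[1], prefix + '1')
        else:                          # leaf: a character
            codes[node] = prefix or '0'
    walk(heap[0][2], '')
    return codes

freq = {'A': 0.35, 'B': 0.1, 'C': 0.2, 'D': 0.2, '_': 0.15}
codes = huffman_codes(freq)
avg = sum(freq[c] * len(codes[c]) for c in freq)
print(codes, round(avg, 2))  # expected bits per character: 2.25
```

For these frequencies the merge order is forced (0.1+0.15, then 0.2+0.2, then 0.25+0.35, then 0.4+0.6), so the codeword lengths match the slides' tree exactly.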

Page 19:

[Figure: Huffman tree construction for the example. The two smallest-weight trees are merged repeatedly: B(0.1) + _(0.15) → 0.25; C(0.2) + D(0.2) → 0.4; 0.25 + A(0.35) → 0.6; 0.4 + 0.6 → 1.0. In the final tree, left edges are labeled 0 and right edges 1, with the C/D subtree on the left and the (B,_)/A subtree on the right]


Page 21:

The Huffman Code is

Therefore,

'DAD' is encoded as 011101, and

10011011011101 is decoded as BAD_AD

character    A     B     C     D     _
probability  0.35  0.1   0.2   0.2   0.15
codeword     11    100   00    01    101
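The encode/decode examples above can be checked mechanically; a small sketch (function names my own) using the slide's code table, where the prefix property makes a greedy left-to-right scan unambiguous:

```python
code = {'A': '11', 'B': '100', 'C': '00', 'D': '01', '_': '101'}

def encode(text):
    return ''.join(code[c] for c in text)

def decode(bits):
    # Greedy scan: because no codeword is a prefix of another,
    # the first codeword match is always the right one.
    inverse = {w: c for c, w in code.items()}
    out, cur = [], ''
    for b in bits:
        cur += b
        if cur in inverse:
            out.append(inverse[cur])
            cur = ''
    return ''.join(out)

print(encode('DAD'))             # 011101
print(decode(encode('BAD_AD')))  # BAD_AD
```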

Page 22:

Notes on Huffman Tree (Coding)

The expected number of bits per character is then
2×0.35 + 3×0.1 + 2×0.2 + 2×0.2 + 3×0.15 = 2.25

Fixed-length encoding needs at least 3 bits for each character

This is an important technique for file (data) compression

Huffman trees/coding have more general applications:
• Assign n positive numbers w1, w2, …, wn to the n leaves of a binary tree
• We want to minimize the weighted path length ∑_{i=1}^{n} l_i w_i, with l_i being the depth of leaf i
• This has particular applications in making decisions: decision trees

Page 23:

Examples of the Decision Tree

[Figure: two decision trees for determining n ∈ {1, 2, 3, 4} by yes/no comparisons: a balanced tree (root test n > 2, then n > 1 or n > 3) and a skewed chain of equality tests. Which tree has the smaller expected number of tests depends on the outcome probabilities: the balanced tree when p1 = p2 = p3 = p4 = 0.25, the skewed tree when p1 = 0.1, p2 = 0.2, p3 = 0.3, p4 = 0.4]

Page 24:

Chapter 10: Limitations of Algorithm Power

Basic Asymptotic Efficiency Classes

1         constant
log n     logarithmic
n         linear
n log n   n-log-n
n²        quadratic
n³        cubic
2ⁿ        exponential
n!        factorial

Page 25:

Polynomial-Time Complexity

Polynomial-time complexity: the complexity of an algorithm is

O(a_b n^b + a_{b−1} n^{b−1} + … + a_1 n + a_0)

with a fixed degree b > 0.

If a problem can be solved in polynomial time, it is usually considered to be theoretically computable on current computers.

When an algorithm's complexity is larger than polynomial, e.g., exponential, it is theoretically considered too expensive to be useful.

This may not exactly reflect its usability in practice!

Page 26:

List of Problems

Sorting

Searching

Shortest paths in a graph

Minimum spanning tree

Primality testing

Traveling salesman problem

Knapsack problem

Chess

Towers of Hanoi …

Page 27:

Lower Bound

Problem A can be solved by algorithms a1, a2, …, ap

Problem B can be solved by algorithms b1, b2, …, bq

We may ask:
• Which algorithm is more efficient? This makes more sense when the compared algorithms solve the same problem.
• Which problem is more complex? We may compare the complexity of the best algorithm for A with that of the best algorithm for B.

For each problem, we want to know the lower bound Ω(·): the best possible efficiency any algorithm for the problem can achieve.

Tight lower bound: we have found an algorithm with this lower-bound complexity.