HEURISTIC & SPECIAL CASE ALGORITHMS FOR DISPERSION PROBLEMS - RAVI, ROSENKRANTZ, TAYI

ROB CHURCHILL (THANKS TO BEHZAD)

TRANSCRIPT

Page 1: HEURISTIC & SPECIAL CASE ALGORITHMS FOR DISPERSION PROBLEMS - RAVI, ROSENKRANTZ, TAYI

ROB CHURCHILL (THANKS TO BEHZAD)

Page 2

Problem:

Given V = {v1, v2, …, vn}, find a subset of p nodes (2 <= p <= n) such that some objective function of the pairwise distances between the chosen nodes is maximized

My first reaction: this sounds like the Max k-Cover problem, except that instead of covering elements, we are maximizing distances

Page 3

Max-Min Facility Dispersion(MMFD)

Given non-negative, symmetric distance function w(x,y) where x, y ∈ V

Find a subset P = {vi1, vi2, …, vip} of V with |P| = p, such that f(P) = min{w(x, y) : x, y ∈ P} is maximized.
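As a concrete check of the max-min objective, here is a minimal sketch (the function name `f_min` and the absolute-difference distance are our illustrative choices, not the paper's):

```python
from itertools import combinations

def f_min(P, w):
    """Max-min dispersion objective: the smallest pairwise distance in P."""
    return min(w(x, y) for x, y in combinations(P, 2))

# Points on a line with w = absolute difference (illustrative choice):
w = lambda x, y: abs(x - y)
print(f_min([0, 3, 7], w))  # -> 3, since the pairwise distances are 3, 7, 4
```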

Page 4

Max-Avg Facility Dispersion (MAFD)

Given non-negative, symmetric distance function w(x,y) where x, y ∈ V

Find a subset P = {vi1, vi2, …, vip} of V with |P| = p, such that f(P) = (2 / [p(p-1)]) · Σ{w(x, y) : x, y ∈ P} is maximized (the average over all p(p-1)/2 pairs).
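The max-avg objective admits the same kind of sketch (again, `f_avg` and the distance function are ours):

```python
from itertools import combinations

def f_avg(P, w):
    """Max-avg dispersion objective: 2/(p(p-1)) times the sum of pairwise distances."""
    p = len(P)
    return 2.0 / (p * (p - 1)) * sum(w(x, y) for x, y in combinations(P, 2))

w = lambda x, y: abs(x - y)
print(f_avg([0, 3, 7], w))  # -> (3 + 7 + 4) / 3 = 14/3 ≈ 4.67
```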

Page 5

MMFD & MAFD are NP-Hard

Even when distance function is a metric

Shown by a reduction from the NP-complete problem CLIQUE.

CLIQUE asks whether a given graph G = (V, E) contains a clique of size >= J

Page 6

Reduction

w(x, y) = 1 if x and y are adjacent in G, 0 otherwise

Set p = J

For MAFD, the optimal value equals 1 if and only if G contains a clique of size J; if the optimal value is less than 1, no clique of size J exists

For MMFD, the optimal value is 1 if G contains a clique of size J, and 0 otherwise
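The reduction can be exercised on a small graph (function names are ours; this is the slide's 0/1 weighting, shown for illustration):

```python
from itertools import combinations

def clique_to_dispersion(edges):
    """Build the slide's distance function: edge -> 1, non-edge -> 0."""
    adj = {frozenset(e) for e in edges}
    return lambda x, y: 1 if frozenset((x, y)) in adj else 0

def mmfd_value(P, w):
    """Max-min objective value of a placement P."""
    return min(w(x, y) for x, y in combinations(P, 2))

# Triangle {0, 1, 2} plus a pendant node 3: is there a clique of size p = 3?
w = clique_to_dispersion([(0, 1), (1, 2), (0, 2), (2, 3)])
print(mmfd_value([0, 1, 2], w))  # -> 1: {0, 1, 2} is a clique
print(mmfd_value([1, 2, 3], w))  # -> 0: {1, 2, 3} is not (1 and 3 not adjacent)
```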

Page 7

How do we solve these?

If we can’t get an optimal solution, we will settle for a good approximation

There are no absolute approximation algorithms for MMFD or MAFD unless P = NP

We want a relative approximation algorithm

Page 8

Use Greedy Algorithms

“Greed is good.” - Gordon Gekko

Page 9

Max-Min Greedy Algorithm

Step 1. Let vi and vj be the endpoints of an edge of maximum weight.

Step 2. P <— {vi, vj}.

Step 3. while ( |P| < p ) do

begin

a. Find a node v ∈ V \ P such that min{w(v, v') : v' ∈ P} is maximum among the nodes in V \ P.

b. P <— P U {v}

end

Step 4. Output P.

Provides a 2-approximation to the optimal value when w is a metric
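Steps 1-4 can be sketched directly in code (the function name and example points are ours; ties are broken by input order):

```python
from itertools import combinations

def max_min_greedy(V, w, p):
    """The slides' Max-Min greedy heuristic."""
    # Steps 1-2: seed P with the endpoints of a maximum-weight edge.
    vi, vj = max(combinations(V, 2), key=lambda e: w(*e))
    P = [vi, vj]
    # Step 3: repeatedly add the node whose minimum distance to P is largest.
    while len(P) < p:
        v = max((u for u in V if u not in P),
                key=lambda u: min(w(u, q) for q in P))
        P.append(v)
    return P

w = lambda x, y: abs(x - y)
print(sorted(max_min_greedy([0, 1, 2, 8, 9, 10], w, 3)))  # -> [0, 2, 10]
```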

Page 10

Max-Avg Greedy Algorithm

Step 1. Let vi and vj be the endpoints of an edge of maximum weight.

Step 2. P <— {vi, vj}.

Step 3. while ( |P| < p ) do

begin

a. Find a node v ∈ V \ P such that Σ{w(v, v') : v' ∈ P} is maximum among the nodes in V \ P.

b. P <— P U {v}.

end

Step 4. Output P.

Provides a 4-approximation of the optimal solution when w is a metric
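The only change from the max-min version is the selection rule in step 3a: sum of distances instead of minimum. A sketch with Euclidean distance (function name and example points are ours):

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+

def max_avg_greedy(V, w, p):
    """The slides' Max-Avg greedy heuristic."""
    # Steps 1-2: seed P with the endpoints of a maximum-weight edge.
    vi, vj = max(combinations(V, 2), key=lambda e: w(*e))
    P = [vi, vj]
    # Step 3: add the node with the largest total distance to P.
    while len(P) < p:
        v = max((u for u in V if u not in P),
                key=lambda u: sum(w(u, q) for q in P))
        P.append(v)
    return P

w = dist
print(max_avg_greedy([(0, 0), (12, 0), (6, 8), (1, 1)], w, 3))
# -> [(0, 0), (12, 0), (6, 8)]
```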

Page 11

Special Cases

For one dimensional data points, you can solve MMFD & MAFD optimally in polynomial time

For two-dimensional data points, a heuristic solves MAFD with a better approximation guarantee than the greedy algorithm, still in polynomial time

2-D MMFD is NP-hard; whether 2-D MAFD is NP-hard remains open

Page 12

1-D MAFD & MMFD

Restricting the points to 1-D allows for a dynamic programming optimal solution in polynomial time

O(max{n log n, pn})

V = {x1, x2, …, xn}

Page 13

How it works

Sort the points in V (n log n time)

w(x, y) = distance from x to y

OPT(j, k) = the optimal objective value when k points are picked from x1, …, xj

OPT(n, p) = optimal solution for the whole set

Page 14

Recursive Statement

OPT(j, k) = max{ OPT(j-1, k), value of OPT(j-1, k-1) ∪ {xj} }

Page 15

Runtime MAFD

OPT(j-1, k) and OPT(j-1, k-1) are constant time lookups

Store the representative of OPT(j-1, k-1) in μ(j-1, k-1)

OPT(j-1, k-1) ∪ {xj} is constant time: (w(xj, μ(j-1, k-1)) + (k-1) · OPT(j-1, k-1)) / k = the new average distance

Page 16

Runtime MMFD

Store the most recently picked element in the optimal solution in f(j-1, k-1)

This gives a constant-time computation of OPT(j-1, k-1) ∪ {xj}: min{ OPT(j-1, k-1), w(xj, f(j-1, k-1)) }

Page 17

Runtime

Both are O(n log n + pn), since the computation per DP entry is constant time when the right auxiliary information is stored

Page 18

The Dynamic Programming Algorithm

(*- - In the following, array F represents the function f in the formulation. - -*)

Step 1. Sort the given points, and let {x1, x2, …, xn} denote the points in increasing order.

Step 2. for j := 1 to n do F[0, j] <— 0;

Step 3. F[1, 1] <— 0.

Step 4. (*- - Compute the value of an optimal placement - -*)

for j := 2 to n do
  for k := 1 to min(p, j) do
  begin
    t1 <— F[k, j-1] + k(p - k)(xj - xj-1);
    t2 <— F[k-1, j-1] + (k - 1)(p - k + 1)(xj - xj-1);
    if t1 > t2 then (*- - do not include xj - -*)
      F[k, j] <— t1
    else (*- - include xj - -*)
      F[k, j] <— t2;
  end;

Page 19

The Algorithm cont.

Step 5. (*- - Construct an optimal placement - -*)

P <— {x1}; k <— p; j <— n;

while k > 1 do
begin
  if F[k, j] = F[k-1, j-1] + (k - 1)(p - k + 1)(xj - xj-1) then (*- - xj is included in the optimal placement - -*)
  begin
    P <— P U {xj}; k <— k - 1;
  end;
  j <— j - 1;
end;

Step 6. Output P.
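The pseudocode above transcribes into a short runnable sketch (the name `mafd_1d` is ours). Here F[k][j] stores the maximum total pairwise distance achievable by choosing k of the first j sorted points; each gap xj - xj-1 contributes once per (left, right) pair of chosen points it separates, so the MAFD average is 2·F[p][n] / (p(p-1)):

```python
def mafd_1d(points, p):
    """Transcription of the slides' Steps 1-6 for 1-D MAFD. Assumes 2 <= p <= n."""
    xs = sorted(points)                       # Step 1
    n = len(xs)
    NEG = float("-inf")                       # "impossible" entries (k > j)
    F = [[NEG] * (n + 1) for _ in range(p + 1)]
    for j in range(1, n + 1):                 # Step 2
        F[0][j] = 0.0
    F[1][1] = 0.0                             # Step 3
    for j in range(2, n + 1):                 # Step 4
        gap = xs[j - 1] - xs[j - 2]
        for k in range(1, min(p, j) + 1):
            t1 = F[k][j - 1] + k * (p - k) * gap                # skip x_j
            t2 = F[k - 1][j - 1] + (k - 1) * (p - k + 1) * gap  # take x_j
            F[k][j] = max(t1, t2)
    P, k, j = [xs[0]], p, n                   # Step 5: backtrack
    while k > 1:
        gap = xs[j - 1] - xs[j - 2]
        if F[k][j] == F[k - 1][j - 1] + (k - 1) * (p - k + 1) * gap:
            P.append(xs[j - 1])
            k -= 1
        j -= 1
    return sorted(P), F[p][n]                 # Step 6

print(mafd_1d([9, 1, 0, 10], 2))  # -> ([0, 10], 10.0): the two extremes
```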

Page 20

2-D MAFD Heuristic

Uses 1-D MAFD algorithm as the base

Gives a π/2-approximation

Page 21

How it works

given V = {v1, v2, …, vn}

vi = (xi, yi) (coordinates)

p <= n = |V|

Page 22

The Algorithm

Step 1. Obtain the projections of the given set V of points onto each of the four axes defined by the equations y = 0, y = x, x = 0, and y = -x.

Step 2. Find optimal solutions to each of the four resulting instances of 1-D MAFD.

Step 3. Return the placement corresponding to the best of the four solutions found in Step 2.
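Steps 1-3 can be sketched as follows (all names are ours; a brute-force 1-D solve stands in for the polynomial-time 1-D DP, so this is only an illustration of the projection idea):

```python
from itertools import combinations
from math import sqrt

def project(points, axis):
    """Project 2-D points onto one of the four axes from Step 1."""
    if axis == "y=0":
        return [x for x, y in points]
    if axis == "x=0":
        return [y for x, y in points]
    if axis == "y=x":
        return [(x + y) / sqrt(2) for x, y in points]
    return [(x - y) / sqrt(2) for x, y in points]  # axis "y=-x"

def heuristic_2d_mafd(points, p):
    """Steps 2-3: solve each projected 1-D instance, keep the best placement."""
    def total(vals, subset):  # total pairwise distance of a projected subset
        return sum(abs(vals[a] - vals[b]) for a, b in combinations(subset, 2))
    best, best_val = None, -1.0
    for axis in ("y=0", "y=x", "x=0", "y=-x"):
        proj = project(points, axis)
        idx = max(combinations(range(len(points)), p),
                  key=lambda s: total(proj, s))
        if total(proj, idx) > best_val:
            best, best_val = [points[i] for i in idx], total(proj, idx)
    return best

print(heuristic_2d_mafd([(0, 0), (10, 0), (0, 10), (5, 5)], 2))
# -> [(10, 0), (0, 10)]: the diagonal pair wins on the y = -x projection
```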

Page 23

Relation to Study Group Formation & High Variance Clusters

These algorithms create one maximum-dispersion group, not k such groups

If you want k high-variance clusters, set p = n/k and run the algorithm (whichever you choose) k-1 times, removing each chosen group; the last n/k points form the last cluster

This could guarantee that the first few groups have high variance, but not the later ones
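That repetition idea can be sketched as follows (all names are ours; the max-avg greedy heuristic from the earlier slide is inlined, and for simplicity we assume k divides n):

```python
from itertools import combinations

def k_dispersed_clusters(V, w, k):
    """Pull out one max-dispersion group of size n/k at a time using the
    max-avg greedy heuristic; whatever remains is the last cluster."""
    V = list(V)
    p = len(V) // k          # assumes k divides n evenly
    clusters = []
    for _ in range(k - 1):
        # max-avg greedy on the remaining points
        P = list(max(combinations(V, 2), key=lambda e: w(*e)))
        while len(P) < p:
            P.append(max((u for u in V if u not in P),
                         key=lambda u: sum(w(u, q) for q in P)))
        clusters.append(P)
        V = [u for u in V if u not in P]
    clusters.append(V)       # the last n/k points form the final cluster
    return clusters

print(k_dispersed_clusters([0, 1, 2, 3, 10, 11, 20, 21],
                           lambda x, y: abs(x - y), 2))
```

On this example the first cluster grabs the spread-out points {0, 1, 20, 21}, leaving the tighter {2, 3, 10, 11} as the last cluster, which illustrates the caveat above: only the early groups are guaranteed high dispersion.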

Page 24

Study group formation

Most study groups only study one subject

If you wanted to assign students one study group per subject, you could simplify their attributes to one dimension per subject and solve each subject optimally.

Instead of the exact algorithm described, minimize the distance from the mean, but stay on the opposite side of the mean from the teacher node

Maybe have positive & negative distances to reflect which side of the mean a point is on

This would ensure that people who would learn (below the mean) are picked before people who would not

You want multiple study groups and the highest total amount of learning

Not sure how to do this…

Page 25

References

S. S. Ravi, D. J. Rosenkrantz, and G. K. Tayi. Heuristic and Special Case Algorithms for Dispersion Problems. Operations Research 42(2):299-310, 1994.