integration of artificial intelligence and operations research for

Post on 10-Feb-2017


Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems

Carla P. Gomes
Cornell University
gomes@cs.cornell.edu

Ken McAloon and Carol Tretkoff
ILOG
{mcaloon,tretkoff}@ilog.com

2

AI, OR, and CS

AI OR

CS

3

Integration of Artificial Intelligence & Operations Research Techniques

AI
• Representations: Constraint Languages, Logic Formalisms, Object-Oriented Prog., Bayesian Nets, Rule Based Systems, ...
• Tools: Constraint Propagation, Systematic Search, Stochastic Search, ...
• Pros / Cons: Rich Representations; Computational Complexity

OR
• Representations: Mathematical Modeling Languages, Linear & Non-linear (In)Equalities, ...
• Tools: Linear Programming, Mixed-Integer Prog., Non-linear Models, ...
• Pros / Cons: More Tractable (LP); Primarily Complete Info; Limited Representations

Combinatorial Problems: Planning, Scheduling

THE CHALLENGE (AI and OR): UNIFY APPROACHES TO:
• SCALE UP SOLUTIONS
• HANDLE UNCERTAINTY
• ANALYZE COMPLEXITY (phase transition)
• EXPLOIT PROBLEM STRUCTURE
• INCREASE ROBUSTNESS

(diagram label: FRAGILE)

[Screenshot: ROME LABORATORY OUTAGE MANAGER (ROMAN), with menus Parameters, Load, Run, Gantt Charts, Utilities, and a Gantt chart of AC-POWER status for DIV1 to DIV4 over a time axis from 0 to 90]

4

Outline

I. Short Overview of OR
II. Disjunctive Programming and Hybrid Solvers
III. Exploiting Randomization to Solve Hard Combinatorial Problems
IV. Conclusions

5

I. Short OR Overview

6

Outline for Linear Programming and Integer Programming

• Standard Form of LP and a Simple Example
• Geometric Interpretation of LP
• Complexity issues
• MIP
• Example: Fast Food
• Example: Capacitated Warehouse
• Example: 911

7

Outline

1. Short Overview of OR
2. Constraint Programming
3. Cooperating Solvers
4. Disjunctive Programming
5. Exploiting Randomization to Solve Hard Combinatorial Problems
6. Conclusions

8

Optimization Technology Evolution

[Figure: timeline running from 1947 to 1998, with axis ticks at 1960, 1970, 1980 and 1990. Milestones shown include: Dispatch Rules; SA, GA, Tabu; CPM/PERT; Constraint-based Scheduling; Primal Simplex LP; Parallel LP/MIP; Concurrent Scheduling; Interior Point; Constraint Propagation; Large IPs / MIP; Shifting Bottleneck; First CP Systems; Cooperating Solvers (LP/CP); Global constraints; Barrier LP; Barrier Crossover; Dual Simplex Implementation; Dual Simplex]

9

1. Short OR Overview


11

An LP Story

A factory can produce n products from m parts.
For product j it needs aij units of part i.
There are bi units of part i available.
Each unit of product j sold earns cj.

The amount of each product to make is unknown: xj ≥ 0.
Each part i determines a constraint:
ai1 x1 + … + ain xn ≤ bi

Obvious solution: do nothing (x = 0)

Better: maximize c1 x1 + … + cn xn

12

Standard Forms of LP

A linear program (LP) in standard form (Dantzig 1947)

max c^T x subject to Ax ≤ b, x ≥ 0

Input data: c (n x 1), A (m x n), b (m x 1).
Variables: x (n x 1)

13

Standard Forms of LP

// The objective function
max c1 x1 + … + cn xn

// The constraints
subject to
a11 x1 + … + a1n xn ≤ b1
...
am1 x1 + … + amn xn ≤ bm

x1 ≥ 0, … , xn ≥ 0

14

Standard Forms of LP

• In OR emphasis is on optimality

• Solution means optimal solution

• Feasible solution means solution in the ordinary sense

15

Standard Forms of LP

Interpretation of standard form:
• xj = amount of product j to make
• cj = revenue per unit of product j
• bi = available amount of component i
• aij = units of i used per unit of j produced

The constraints “say”: aij xj = units of i used by j, so Σj aij xj = total units of i used ≤ bi

16

What are models?

A model is a data-independent abstraction of a problem.
A model lets you write down the mathematical representation of a problem independently of the data.

[Diagram: a Project combines a Model with Data; each pairing gives one problem instance]

17

Products Could be Jewelry

Products: Rings and Earrings
Components: Gold and Diamonds

One ring requires 3 units of Gold and 1 Diamond
One set of earrings requires 2 units of Gold and 2 Diamonds

Total Gold and Diamonds are limited

Profit is different for Rings than for Earrings

Products = { rings, earrings };
Components = { Gold, Diamonds };

demand = [ [3, 1], [2, 2] ];

stock = [150, 180];

profit = [60, 40];

18

Products Could be Chemicals

Products: Ammonia Gas = NH3, Ammonium Chloride = NH4Cl

Components: Nitrogen, Hydrogen, Chlorine

One unit of Gas requires 1 unit of Nitrogen and 3 units of Hydrogen
One unit of Chloride requires 1 unit of Nitrogen, 4 units of Hydrogen, and 1 unit of Chlorine

Total Nitrogen, Hydrogen, Chlorine is limited

Profit is different for Gas than Chloride

Products = { gas, chloride };

Components = { nitrogen, hydrogen, chlorine };

demand = [ [1, 3, 0], [1, 4, 1] ];

stock = [50, 180, 40];

profit = [30, 40];

19

The Problems Have One Model

// Data
enum Products ...;
enum Components ...;

float+ demand[Products, Components] = ...;
float+ profit[Products] = ...;
float+ stock[Components] = ...;

// Decision variables
var float+ production[Products];

// Objective function
maximize
   sum (p in Products) profit[p] * production[p]

// Constraints
subject to {
   forall (c in Components)
      sum (p in Products) demand[p, c] * production[p] <= stock[c]
};
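To make the "one model, two data sets" point concrete, here is a minimal C++ sketch that evaluates the same production model on both the jewelry data and the chemical data from the preceding slides. It is not OPL and not how an LP solver works: it assumes exactly two products and simply tests every intersection of two constraint lines. The names `Row` and `bestProfit` are hypothetical, introduced only for this illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// One constraint row: a1*x1 + a2*x2 <= b.
struct Row { double a1, a2, b; };

// Illustrative helper (not part of OPL): maximizes profit for exactly two
// products by testing every intersection of two constraint lines.
double bestProfit(const std::vector<std::vector<double>>& demand, // [product][component]
                  const std::vector<double>& stock,               // [component]
                  const std::vector<double>& profit) {            // [product]
    std::vector<Row> rows;
    for (std::size_t c = 0; c < stock.size(); ++c)
        rows.push_back({demand[0][c], demand[1][c], stock[c]});
    rows.push_back({-1.0, 0.0, 0.0});  // production[0] >= 0
    rows.push_back({0.0, -1.0, 0.0});  // production[1] >= 0

    double best = -std::numeric_limits<double>::infinity();
    for (std::size_t i = 0; i < rows.size(); ++i) {
        for (std::size_t j = i + 1; j < rows.size(); ++j) {
            double det = rows[i].a1 * rows[j].a2 - rows[i].a2 * rows[j].a1;
            if (std::fabs(det) < 1e-12) continue;  // parallel lines, no vertex
            // Cramer's rule for the intersection of rows i and j.
            double x1 = (rows[i].b * rows[j].a2 - rows[i].a2 * rows[j].b) / det;
            double x2 = (rows[i].a1 * rows[j].b - rows[i].b * rows[j].a1) / det;
            bool feasible = true;
            for (const Row& r : rows)
                if (r.a1 * x1 + r.a2 * x2 > r.b + 1e-7) feasible = false;
            if (feasible) best = std::max(best, profit[0] * x1 + profit[1] * x2);
        }
    }
    return best;
}
```

Calling `bestProfit` with the jewelry data (`demand = {{3,1},{2,2}}`, `stock = {150,180}`, `profit = {60,40}`) and then with the chemical data (`demand = {{1,3,0},{1,4,1}}`, `stock = {50,180,40}`, `profit = {30,40}`) reuses the same code unchanged; only the data differs, which is the point of separating model from data.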

20

OR Modeling Systems

• OPL
• AMPL
• 2LP
• AIMMS
• GAMS
• MPL
• ILOG Planner
• etc

21

The Dual

The dual linear program (von Neumann 1947):

min y^T b subject to y^T A ≥ c^T, y ≥ 0

Variables: y (m x 1)

Awesome Symmetry - The dual of the dual is the primal

22

Rows and Columns Exchanged

min b1 y1 + … + bm ym

subject to

a11 y1 + … + am1 ym ≥ c1
...
a1n y1 + … + amn ym ≥ cn

y1 ≥ 0, … , ym ≥ 0

23

Duality Theorem

Theorem (when an optimum exists): min y^T b = max c^T x

• Consequence: This turns the optimality problem into a feasibility problem in x and y:
  Ax ≤ b, x ≥ 0
  y^T A ≥ c^T, y ≥ 0
  y^T b = c^T x

• Consequence: Enumeration not needed to verify optimality
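The reason a feasible pair (x, y) with equal objective values certifies optimality is the weak duality inequality, which follows in one line from the two sets of constraints. The derivation below is a standard textbook step, written in the notation of the preceding slides:

```latex
\begin{aligned}
c^{T}x \;\le\; (y^{T}A)\,x \;=\; y^{T}(Ax) \;\le\; y^{T}b
\end{aligned}
```

The first inequality uses y^T A ≥ c^T together with x ≥ 0; the second uses Ax ≤ b together with y ≥ 0. So every dual feasible y bounds every primal feasible x from above, and when y^T b = c^T x neither can be improved.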

24

Duality Theorem

• Sensitivity Analysis
• Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints, which measure the rate of change of the objective function with respect to the right hand side bounds b

yi* = ∂Z / ∂bi where Z is the optimum

Reference: McAloon and Tretkoff [1996] Wiley

Duality

Two different views of the same phenomenon

Point vs Set

Arc vs Node

Momentum vs Position

Vector vs Hyperplane

Landlord vs Renter

26

Simplex and Barrier

• The simplex algorithm turns the feasibility problem into an iterative repair process with a powerful evaluation function

• The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope

27

Geometric Interpretation of LP

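Max: X
subject to:
-X + Y <= 4
X + 4*Y <= 36
2*X + Y <= 23
X + Y >= 4
Y >= X + 10

[Figure: feasible polygon in the X-Y plane with labeled points (0,4), (4,0), (8,0), (10,3) and (4,8), and two solution paths labeled Barrier and Simplex]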

28

Complexity of Linear Programming

Simplex Method
• Worst-case --- exponential (Klee and Minty 72)
• Practice --- good performance

Ellipsoid Method (Khachiyan’s Ellipsoid Method)
• Worst-case --- polynomial
• Practice --- poor performance

29

Complexity of Linear Programming

Interior Point Methods or Barrier Methods (“Karmarkar’s” Method and variants)
• Worst-case --- polynomial
• Practice --- good performance

30

Complexity of Linear Programming

• Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice.

• Which method performs best is problem dependent.

31

Success Stories

• Industrial Planning
  Given current resources, decide what to produce in what quantity

• Supply Chain Management
  Multiperiod planning models that link flow from one period to the next

• Network Flow
  How best to route goods across a network

32

Assumptions of Linear Programming

• Linearity
  when violated (e.g. xy = 50): Nonlinear programming

• Continuity
  when violated (e.g. x integral): (Mixed) Integer programming

33

Assumptions of Linear Programming - continued

• No Disjunctive Constraints
  when violated (e.g. x ≥ 100 or x = 0): Disjunctive programming, or additional 0-1 variables and Big M constraints

• Certainty
  when violated (e.g. cost c is a random variable): Stochastic programming

34

Search and MIP

• In order to deal with variables that must have integer values in the solution, a search must be performed.

• Mixed Integer Programming problems are combinatorial optimization problems and are NP hard

• feasibility is NP-Complete
• verifying optimality is co-NP-Complete

35

MIP and Combinatorial Optimization

• These problems have been attacked by both the AI and OR communities.

• In AI, these problems are attacked as CSPs or as Planning Problems.

• In OR, they are done as MIPs and use linear relaxation to help guide the search.

• The overriding idea in each case is to limit search.

36

Integer Program: All Integer Points in Region

37

Cut to Create Integer Vertex

Integer Vertex

38

Example - Fast Food

• Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet?

• If so, what is the least he can do it for?

39

Nutritional Requirements

• At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron

• At least 55 grams of protein
• At most 3000 milligrams of sodium
• At most 30% of the calories can come from fat

• Nutritional information is available from fast food outlets

40

College Student’s Requirements

• At least 2000 calories a day
• No more than 3 servings of any one food
• Milk only with cereal and not as a stand-alone drink

41

Fast Food - MIP Model

• We will have variables Servk to represent the number of servings of item k in the plan.

• The variable Servk will have to take an integer value for the solution to be valid.

• The objective function: Z for cost

42

Fast Food - MIP Model

• Let foodk,j represent the percent of RDA of nutrient j in a serving of item k

• Then for each nutrient j, we have a constraint

Σk foodk,j Servk ≥ 100

43

Fast Food - MIP Model

• Let sodiumk represent the amount of sodium (in milligrams) in a serving of item k

• For sodium we have the constraint Σk sodiumk Servk ≤ 3000

• Similarly for fat

44

Fast Food - MIP Model

• Let costk represent the cost of a serving of item k

• For the objective function we have the defining constraint

Σk costk Servk = Z

45

Fast Food - Solution

• With a MIP solver and a way to input these constraints we ask for

• a solution that makes the variables Servk integral

• and which minimizes Z

46

MIP Solution Technique

• What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation
  – the solution to the problem with the integrality requirements relaxed

• Initialize the global variable best_so_far to 1000 (or something else very big).

47

At a Node

• Compute a solution to the linear relaxation which minimizes Z, yielding z*. Prune this node if

z* ≥ best_so_far

• If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node.

48

Branching at a node

• Choose a variable Servk whose value s* is not integral.

• Typical heuristic: most non-integral variable

• Create two child nodes:
  • one adds Servk ≤ floor(s*)
  • the other adds Servk ≥ ceil(s*)
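The node processing and branching steps above can be sketched in a few lines of standard C++. This is only an illustration under simplifying assumptions: instead of the diet MIP (which needs a real LP solver), it uses a 0-1 knapsack, whose linear relaxation can be solved greedily; it maximizes rather than minimizes, so a node is pruned when its relaxation bound is no better than `bestSoFar`; and it branches on items in ratio order rather than on the most non-integral variable. The names `Item`, `relaxationBound`, `branch` and `solveKnapsack` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Item { double value; double weight; };  // weights assumed positive

// Linear relaxation of the remaining subproblem: fill the capacity greedily
// in value/weight order, allowing a fraction of the last item.
static double relaxationBound(const std::vector<Item>& items, std::size_t next,
                              double value, double capacity) {
    for (std::size_t k = next; k < items.size() && capacity > 0; ++k) {
        double take = std::min(1.0, capacity / items[k].weight);
        value += take * items[k].value;
        capacity -= take * items[k].weight;
    }
    return value;
}

static void branch(const std::vector<Item>& items, std::size_t next,
                   double value, double capacity, double& bestSoFar) {
    // Prune: the relaxation cannot beat the best integral solution found.
    if (relaxationBound(items, next, value, capacity) <= bestSoFar) return;
    if (next == items.size()) { bestSoFar = value; return; }  // integral leaf
    // Two children: the item is taken (x = 1) or left out (x = 0).
    if (items[next].weight <= capacity)
        branch(items, next + 1, value + items[next].value,
               capacity - items[next].weight, bestSoFar);
    branch(items, next + 1, value, capacity, bestSoFar);
}

double solveKnapsack(std::vector<Item> items, double capacity) {
    // Sort by decreasing value/weight so the greedy relaxation is optimal.
    std::sort(items.begin(), items.end(), [](const Item& a, const Item& b) {
        return a.value * b.weight > b.value * a.weight;
    });
    double bestSoFar = 0.0;  // the empty knapsack is always feasible
    branch(items, 0, 0.0, capacity, bestSoFar);
    return bestSoFar;
}
```

For example, `solveKnapsack({{60, 10}, {100, 20}, {120, 30}}, 50)` explores only a handful of nodes because the relaxation bound cuts off whole subtrees, which is the effect the slides describe for the diet model.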

49

Good News

• The linear relaxation can prune nodes before all variables Servk are forced to be integral.

• Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why

• A solution to the LP is at a vertex
• A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints

• Many of these constraints are integer bounding constraints yielding X = integer

50

Arboreally Speaking

• Best-first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A*

• If the linear relaxation is tight

| z*linear - z*integral | is relatively small

then z*linear is an excellent evaluation function

51

Answer - Fast Food

Total cost is 8.71

Buy 3 burgers
Buy 2 fries
Buy 3 honeys
Buy 1 yogurt
...

52

Example - Fixed Cost

• Warehouses must be rented in order to supply stores and we must decide which to use

• For each store j we know its monthly demand dj

• For each warehouse i we know its capacity ki

• For each warehouse i we know the fixed cost to run it each month fci

• For each pair i, j we know the monthly cost cij of supplying j from i

53

Example - Fixed Cost

• Xij is the fraction of store j’s demand met by i
  • 0 ≤ Xij ≤ 1

• Yi is a “fuzzy” boolean
  • it will be 1 if the warehouse is rented
  • 0 if it is not rented
  • 0 ≤ Yi ≤ 1

54

Example - Fixed Cost

• Each store must be supplied: Σi Xij = 1 for each store j

• Warehouse capacity can not be exceeded: Σj dj Xij ≤ ki for each warehouse i

• Tighter: Σj dj Xij ≤ ki Yi for each warehouse i

55

Example - Fixed Cost

• Objective function

minimize Σi fci Yi + Σi Σj cij Xij

• This yields a MIP with 0-1 variables Yi

56

Branch and Cut: An Enhanced Solution Method

• Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation

Xij ≤ Yi for each i, j

• Add at a node if violated by solution to linear relaxation

• Powerful method - will solve the Imperial College OR lib CW problems very easily

57

Example - Call 911

• PCTs answer the phone 24 hours a day, 7 days a week.

• It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate.

• Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours.

58

Example - Call 911

• Each PCT has a work week of 5 days followed by 2 days off.

• Want to meet the demand with minimal or near-minimal number of PCTs.

• So need to determine how many PCTs start their work week at each hour h of the week

59

Modeling 911

• A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0 ≤ h < 168.

60

Modeling 911

• A continuous variable Z will represent the objective function

Σh Pcth = Z

• There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed.

61

Modeling 911

• For this constraint we need to represent the number of workers who are on duty at time h

• Certainly, those who start the week at time h are here, as are those who started the week at time h - 1

• And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break.

62

Modeling 911

• This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h

Σw Pctw ≥ bh

63

Call 911 solved with progressive roundoff

int b[168] = { // New York City 911
    30,24,18,15,14,14,15,25,34,36,38,40,41,43,46,57,57,59,61,59,55,50,45,38,
    32,25,20,17,15,13,17,25,32,35,38,40,42,43,47,58,57,57,59,57,55,52,47,41,
    33,25,20,17,15,13,15,25,32,33,37,39,42,43,47,57,56,57,57,56,53,50,47,41,
    34,27,22,19,16,15,16,25,31,35,37,40,44,45,48,57,57,56,58,56,53,53,46,41,
    34,28,23,19,16,15,17,25,33,37,39,42,45,47,51,59,58,60,61,61,57,56,57,55,
    48,41,35,30,26,20,18,22,26,32,42,46,49,53,54,56,56,56,59,59,57,57,56,56,
    52,46,41,34,29,23,18,19,25,31,36,41,46,50,52,53,52,53,54,53,50,49,45,40
};

64

Modeling 911

• Subject to these constraints we want to find a solution which makes the Pcth integer and which makes Z small.

• The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer.

• The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers.

65

Modeling 911

• For this and many other applications, heuristics can be used to develop good solutions

• Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc.

66

Solving the Integer Problem

int main() // Planner Code
{
    IlcInitFloat();
    IlcManager m(IlcNoEdit);
    IlcLinOpt simplex(m);
    IlcFloatVarArray Pct(m,168,0,1000);
    IlcFloatArray coeffs(m,168);
    int i,j,k,h,n;

67

Solving the Integer Problem

    // For each hour h: sum over the on-duty start hours w of Pct[w] >= b[h]

    for (h = 0; h < 168; h++) {              // for each hour of 168 in week
        for (j = 0; j < 168; j++)
            coeffs[j] = 0;
        for (k = 0; k < 5; k++)              // for each of 5 days
            for (j = k*24; j < k*24+8; j++)  // for each of 8 hours
                if (j != (k*24+4))           // skip the break hour
                    coeffs[(h+168-j)%168] = 1;
        simplex.add(IlcScalProd(coeffs, Pct) >= b[h]);
    }
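The index arithmetic in the constraint loop is easy to get wrong, so here is a solver-free C++ sketch of the same coefficient row, written with the standard library only. It assumes the shift pattern stated on the slides (5 consecutive days, 8 hours per day, a break in the fifth hour) and mirrors the expression `(h+168-j)%168` from the Planner code. The names `coverageRow` and `onDuty` are hypothetical.

```cpp
#include <cstddef>
#include <vector>

constexpr int kHoursPerWeek = 168;

// Returns the 0/1 row for hour h: entry w is 1 when a worker whose work week
// starts at hour w is on duty (and not on break) at hour h.
std::vector<int> coverageRow(int h) {
    std::vector<int> coeffs(kHoursPerWeek, 0);
    for (int day = 0; day < 5; ++day) {            // 5 working days
        for (int offset = 0; offset < 8; ++offset) {
            if (offset == 4) continue;             // break after 4 hours
            int elapsed = day * 24 + offset;       // hours since week start
            coeffs[(h + kHoursPerWeek - elapsed) % kHoursPerWeek] = 1;
        }
    }
    return coeffs;
}

// Number of workers on duty at hour h for a given vector of start counts.
int onDuty(const std::vector<int>& starts, int h) {
    std::vector<int> row = coverageRow(h);
    int total = 0;
    for (std::size_t w = 0; w < row.size(); ++w) total += row[w] * starts[w];
    return total;
}
```

Each row has 5 x 7 = 35 ones, so if one worker starts at every hour of the week, `onDuty` reports 35 workers at any hour. A candidate staffing vector can be checked against the demand array `b` by comparing `onDuty(starts, h)` with `b[h]` for every h.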

68

Solving the Integer Linear Problem

    IlcFloatVar Z = IlcSum(Pct);   // Objective
    simplex.setObjMin(Z);
    for (i = 0; i < 168; i++) {    // Progressive roundoff
        n = ceil(simplex.getCurrentValue(Pct[i]));
        // Fix variable and re-optimize
        simplex.add(Pct[i] == n);
    }
    m.out() << "Number of Pcts needed is " << Z << endl;
    m.end();

}

69

Solution

• This code finds a solution with 208 workers in a couple of seconds. The optimum is 207.

• The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution

• [Bartholdi, Ratliff, Orlin]

70

2. Constraint Programming

LP/MIP is Beautiful, except when

• Variable domain information is important to the search strategy
  • especially critical in scheduling
• The problem variables range over symbolic entities and there are lots of symmetries
  • timetabling
• The MIP representation can be too verbose or awkward
  • configuration
• There are just too many constraints
  • e.g. vehicle routing

72

Mathematical Basis of Constraint Programming (CP)

The Constraint Satisfaction Problem:

• Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain.

• A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk) ⊆ D1 x … x Dk.

• A constraint satisfaction problem (CSP) is given by a finite set of constraints.

• A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied.

73

Domain Reduction

• In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints.

• With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint.

• Accelerates convergence toward a solution

• Detects infeasibility

74

Constraint Propagation

• The other key issue is communication among the constraints or subproblems.

• The basic method used is called constraint propagation which links the constraints through their shared variables.

• The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints.
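Domain reduction and propagation can be illustrated with a very small C++ sketch. It assumes interval domains and a single constraint type, `var[x] + gap <= var[y]`, which is far simpler than the constraint library of a real CP system; the names `Interval`, `Precedence` and `propagate` are hypothetical. Each constraint only tightens the bounds of its own two variables, and the outer loop re-runs the constraints until nothing changes, so information flows between constraints purely through the variables they share.

```cpp
#include <vector>

struct Interval { int lo; int hi; };            // domain [lo, hi]
struct Precedence { int x; int y; int gap; };   // var[x] + gap <= var[y]

// Runs bounds propagation to a fixpoint.
// Returns false if some domain becomes empty (infeasibility detected).
bool propagate(std::vector<Interval>& dom, const std::vector<Precedence>& cons) {
    bool changed = true;
    while (changed) {
        changed = false;
        for (const Precedence& c : cons) {
            // Domain reduction for this one constraint.
            if (dom[c.x].lo + c.gap > dom[c.y].lo) {
                dom[c.y].lo = dom[c.x].lo + c.gap;   // raise lower bound of y
                changed = true;
            }
            if (dom[c.y].hi - c.gap < dom[c.x].hi) {
                dom[c.x].hi = dom[c.y].hi - c.gap;   // lower upper bound of x
                changed = true;
            }
            if (dom[c.x].lo > dom[c.x].hi || dom[c.y].lo > dom[c.y].hi)
                return false;
        }
    }
    return true;
}
```

With three variables in [1, 10] and the chain `v0 + 3 <= v1`, `v1 + 3 <= v2`, propagation narrows the domains to [1, 4], [4, 7] and [7, 10] without any search. Raising both gaps to 5 empties a domain, so the infeasibility is detected by propagation alone.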

Monsieur Jourdain Phenomenon

• Like prose, you have been doing constraint propagation all your life.
  – Crossword puzzles

• Incomplete and so backtracking is needed
  – NY Times Sunday Crossword
  – Optical Illusions

• Origin: Vision analysis (Marr, Waltz et al)

76

Strengths of Constraint Programming

Constraint Programming provides:

• Rich representation language.

• CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT.

• Opportunity to choose a good heuristic for the solution strategy.

77

Which Method for Which App?

Application              Technology
Product Mix              LP
Production Planning      MIP
Distribution Planning    MIP
Scheduling               Constraint-Based Scheduling
Dispatching              CP, Local search
Configuration            CP

Axes: Linear => Disjunctive Constraints; Strategic => Operational Optimization

78

3. Cooperating Solvers

First Stop

CP/CP

80

Mother of All Examples - N Queens

• Do we think in terms of queens?
  • Where do we place this queen?

• Do we think in terms of squares?
  • Will this square contain a queen?

• These views are dual to one another

The Primal View

For each queen assign it a square

Place this queen in this square ?

The Dual View

For each square decide whether it will have a queen

Place a queen in this square ?

83

The Primal Model

In which row do we place q[j] - the queen in column j

The constraints
q[i] != q[j]
q[i] - q[j] != i - j
q[i] - q[j] != j - i

Note: no alldifferent constraint

84

Yet Another duality - rows vs columns

In which column do we place qq[i] - the queen in row i

The constraints are the same
qq[i] != qq[j]
qq[i] - qq[j] != i - j
qq[i] - qq[j] != j - i

85

The Relationship

Can link them as inverse functions:

q[qq[i]] = i
qq[q[j]] = j

The constraint propagation

i leaves domain of q[j] iff j leaves domain of qq[i]
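The link between the two models can be checked on a concrete board with a short C++ sketch. It does no propagation and no search: it only tests the three primal constraints on a complete assignment and builds the inverse array, assuming 0-based indices and that `q` is a permutation of 0..n-1. The function names `satisfiesQueenConstraints` and `inverse` are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Primal model check: q[j] is the row of the queen in column j.
bool satisfiesQueenConstraints(const std::vector<int>& q) {
    const int n = static_cast<int>(q.size());
    for (int i = 0; i < n; ++i) {
        for (int j = i + 1; j < n; ++j) {
            if (q[i] == q[j]) return false;            // same row
            if (q[i] - q[j] == i - j) return false;    // same diagonal
            if (q[i] - q[j] == j - i) return false;    // same anti-diagonal
        }
    }
    return true;
}

// Channeling: builds qq with qq[q[j]] = j, so that q[qq[i]] = i as well.
// Assumes q is a permutation of 0..n-1.
std::vector<int> inverse(const std::vector<int>& q) {
    std::vector<int> qq(q.size(), -1);
    for (std::size_t j = 0; j < q.size(); ++j)
        qq[static_cast<std::size_t>(q[j])] = static_cast<int>(j);
    return qq;
}
```

For the 4-queens placement `q = {1, 3, 0, 2}`, `inverse(q)` gives `qq = {2, 0, 3, 1}`, which satisfies the same three constraints, and inverting again returns the original `q`. This is the "rows vs columns" duality in its simplest form.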

86

In this primal/dual model

• Apply first-fail to q[i]

• Lo and behold one-third fewer fails

(Example from Jean Jordan’s thesis)

[Figures, slides 87 to 90: chessboard diagrams showing successive queen placements (Q) and the squares removed by constraint propagation (X)]

91

So …

The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint

The arc consistency of this global constraint is non-trivial to maintain

Network flow algorithms
• flow goes from values to variables
• each variable has unit demand and capacity

92

Remarks

• An IP model will encode the first dual solution

Will this square contain a queen?
x[i][j] = 0 or x[i][j] = 1

• A disaster beyond 30 queens
  • network structure on rows and columns lost

• Another example - sports scheduling

93

Constraints and Indices

• In IP symbols are represented by indices as opposed to values for variables.

  • nurses, teams
  • Paris-St-Germain plays Manchester United on day k
  • xijk ∈ {0,1} to represent “team i plays team j on day k”

• You can’t put symmetry breaking and other constraints on indices.

Second Stop

CP/IP

95

CP Is Powerful, But ….

• Sometimes, inconsistencies can be overlookedX - Y 12X + Y 10X in [1..20]Y in [1..20]

• Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution

• but an LP solver would spot this

96

2 Dimensional Bin Packing

• Application for the Automobile Industry built by Greg Glockner

97

2 Dimensional Bin Packing

• The problem here is to put as many small rectangles as possible in a big rectangle, with 90 degree rotation allowed.
  • The actual application involves circuit boards

• There are two complete models, one a CP model and the other an IP model.
  • The CP model directs the search
  • The LP relaxation prunes the search space by detecting infeasible nodes

98

2-D Bin Packing

• Arrange circuit boards onto raw material
• Boards may be rotated
• Use same number of each board

Objective: minimize scrap

Classic combinatorial optimization problem

99

Solving 2-D Bin Packing

• Use CP to generate partial solutions (nodes)
• Restrict placement to reduce fragmentation of blank space
• Use tight LP to test feasibility
• If any partial solution is infeasible in the LP, prune the tree immediately

• CP constraints reduce the tree width
• LP allows us to prune quickly

100

2 Dimensional Bin Packing

• As the search tree is traversed, the two models are in sync.

• Note that the variables used in the 2 models are disjoint

• The two models are dual to each other
  • The IP sees the model from the point of view of the board, the large rectangle
  • The CP sees the model from the point of view of the small rectangles
• Solutions are obtained in minutes

101

2DBP: Basic CP Formulation

• Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile

• Basic constraints:
  – Disjunctive constraints to prevent overlapping tiles i and j:
      xi + wi ≤ xj  or  xj + wj ≤ xi  or  yi + hi ≤ yj  or  yj + hj ≤ yi
  – Constraints to count the number of each tile type

Tile-oriented formulation

102

2DBP: Basic IP Formulation

Let xijnt = 1 if tile n of type t is in position (i, j)
The constraints are:

Σ(i,j) xijnt ≤ 1    for all n, t
Σ(i',j',n,t : i' ≤ i < i' + wt, j' ≤ j < j' + ht) xi'j'nt ≤ 1    for all i, j
xijnt ∈ {0,1}    for all i, j, n, t

Grid-oriented formulation

103

2DBP: LP Issues

• The LP is large
• The LP exhibits significant primal degeneracy
• The LP exhibits significant dual degeneracy

104

2DBP: LP Issues

• The simplex algorithm cannot solve the LP
• There is no way for a MIP solver to solve the IP as such
• The barrier method can solve the LP

105

2DBP: Summary

• CP as master problem
  – Orders tiles
  – Places tiles by position, then type
  – Selects tile type by frequency to scatter tiles throughout the bin
  – Uses a one-ply lookahead constraint to limit the position of following tile
• LP relaxation prunes the CP search space
  – Checks whether the partial solution will lead to an infeasible instance

• Use idiomatic formulations for CP and IP

106

2DBP: Remarks

• The CP fixes significant numbers of variables at each node

• The LP pre-processor greatly simplifies the LP

• Therefore, the lack of incrementality of the barrier method does not cost us

107

2DBP: Cooperative Algorithm Demo

108

Last Stop

• Constraint Programming and Local Search cooperation

• Another example of duality in action

109

CP/LS

Parallel machines with set-up times

• Ready times
• Due dates
• Splittable jobs
• Rogue machines

Objectives
• meet due dates
• minimize setup costs

110

Two Phase Cooperation

Phase I - the Primal (Work on first objective)

• Configure and schedule the jobs

– Use constraint based scheduling

111

Two Phase Cooperation

Machines morph into trucks

112

Two Phase Cooperation

• Phase 2 - the Dual (Work on second objective)

• Schedule the trucks

– Use Lin, Lin-Kernighan, tabu etc

113

Parallel Machines: Cooperative Algorithm Demo

114

IC Park Example

• Hoist Scheduling (Rodosek and Wallace)

• The original model is an IP

• The CP model is “the same”

• CP guides search, LP relaxation and CP share pruning duties

• No apparent duality

115

Remarks

• One can get great benefit with CP/IP algorithms, CP/CP algorithms and CP/LS algorithms

• IP/LS is just around the corner

• IP/IP cooperation is hard because one can’t formulate truly dual views
  – either simply not there
  – or too verbose
  (counterexamples welcome)

116

4. DISJUNCTIVE PROGRAMMING

117

Disjunctive Linear Programming

• An extension of Mixed Integer Programming

• A union of polyhedral sets (feasible regions) is called a disjunctive set.

118

Disjunctive Set

119

Disjunctive Linear Programming

• The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem.

• The solution set of the disjunctive programming problem is

∩(i<M) ∪(j<N) Fij

120

Disjunctive Linear Programming Examples

• Semi-continuous variables
  either X >= 100
  or X == 0

• Rather than X <= BigM*Y, X >= 100*Y, Y a 0-1 variable
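A short worked check shows why the Big M encoding represents the same set as the disjunction, and why its linear relaxation is loose. The block below uses the constraints from the slide with X ≥ 0; the value M = 1000 in the last line is an assumed upper bound on X chosen only for illustration.

```latex
\begin{aligned}
& X \le M\,Y, \qquad X \ge 100\,Y, \qquad X \ge 0, \qquad Y \in \{0,1\} \\
& Y = 0 \;\Rightarrow\; 0 \le X \le 0 \;\Rightarrow\; X = 0 \\
& Y = 1 \;\Rightarrow\; 100 \le X \le M \\
& \text{Relaxing to } Y \in [0,1] \text{ with } M = 1000:\quad (X, Y) = (50,\ 0.5)
  \text{ is feasible, since } 50 \le 500 \text{ and } 50 \ge 50
\end{aligned}
```

So the integer model is exact, but the relaxed model admits X = 50, a value that neither branch of the disjunction allows. Branching directly on "X >= 100 or X == 0" avoids introducing Y and the loose bound M altogether.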

121

Solution Set Inside Initial Region

122

Disjunctive Linear Programming Examples - continued

• Bollapragada, Ghattas and Hooker
  Truss structure design problem
  Branches directly on alternatives dictated by Hooke’s Law

• Wyatt
  Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization
  Extends Benders decomposition to disjunctive linear programs

123

Disjunctive Linear Programming continued

• Balas, Cornuejols and Ceria
  Generating cuts for disjunctive programming problems.

• McAloon and Tretkoff
  Basic mathematical results: Optimization and Computational Logic, Wiley

124

Disjunctive Linear Programming continued

• Dealing with the disjunctive part requires search.

• This requires an engine which is not available in MIP packages

• Also the linear relaxation is not as tight and the evaluation function is not as faithful

• The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers

• Beringer and DeBacker for MIP

125

To Keep It Simple

GERALD + DONALD = ROBERT

An AI classic (Newell and Simon)

Assignment problem + 1 constraint

Surprisingly hard for MIP solvers: CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution

126

The Disjunctive Program

• One constraint for the equation
  • 100000 G + … + D = 100000 R + … + T

• For each variable X among G,…,T
  • X = 0 or X = 1 or … or X = 9

• For each pair X, Y
  • X ≤ Y-1 or Y ≤ X-1
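As a baseline for the disjunctive program above, the puzzle can also be solved by plain enumeration. The C++ sketch below is not the cooperating-solver method of the slides: it simply tries permutations of the ten digits, so the all-different condition holds by construction, and tests the single linear equation. It assumes, as the ILOG code later in the deck does, that the leading letters D, G and R are non-zero. The names `Digits`, `equationHolds` and `solve` are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <numeric>

// Digit order used here (an arbitrary choice): D O N A L G E R B T.
using Digits = std::array<int, 10>;

// Tests GERALD + DONALD = ROBERT for one assignment of digits to letters.
bool equationHolds(const Digits& v) {
    const long long D = v[0], O = v[1], N = v[2], A = v[3], L = v[4];
    const long long G = v[5], E = v[6], R = v[7], B = v[8], T = v[9];
    if (D == 0 || G == 0 || R == 0) return false;  // no leading zeros
    const long long gerald = 100000*G + 10000*E + 1000*R + 100*A + 10*L + D;
    const long long donald = 100000*D + 10000*O + 1000*N + 100*A + 10*L + D;
    const long long robert = 100000*R + 10000*O + 1000*B + 100*E + 10*R + T;
    return gerald + donald == robert;
}

// Enumerates permutations of 0..9 until the equation holds.
// Returns true and leaves the assignment in v when a solution is found.
bool solve(Digits& v) {
    std::iota(v.begin(), v.end(), 0);  // start from 0,1,...,9
    do {
        if (equationHolds(v)) return true;
    } while (std::next_permutation(v.begin(), v.end()));
    return false;
}
```

Enumeration may visit up to 10! (about 3.6 million) assignments, which is trivial for one equation but gives no pruning at all. The node counts reported on the Results slide show how much the combination of propagation and the linear relaxation reduces that search.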

127

Solution Set: SOME of the Integer Points in the Region

128

The Twin Variables for Cooperating Solvers

• Integer variables for the letters: 0 ≤ g, e, r, a, l, d, o, n, b, t ≤ 9

• With continuous doppelgangers: 0 ≤ G, E, R, A, L, D, O, N, B, T ≤ 9

129

The Variables

• One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver

100000 G + 10000 E + 1000 R + … + D +
100000 D + 10000 O + 1000 N + … + D =
100000 R + 10000 O + 1000 B + … + T

130

The Variables

• One CSP constraint on the integer variables posted to a discrete constraint propagation engine

AllDifferent(g, e, r, a, l, d, o, n, b, t)

131

The Search

• Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely

• The branching strategy is guided by the linear relaxation on the continuous variables

• If there is a non-integral variable X with relaxation value X*, branch on it:

X ≤ floor(X*) or X ≥ ceil(X*)

132

The Search

• If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack

• If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack

133

The Search

• New wrinkle
  • The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values

• In this case, branch to keep them apart
  • either X ≤ Y - 1
  • or Y ≤ X - 1

134

The Variables

int main()
{
    IlcInitFloat();
    IlcManager m(IlcNoEdit);

    IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9),
              G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9);

    IlcIntVarArray vars(m, 10, D, O, N, A, L, G, E, R, B, T);

    // Continued on next slide

135

The Constraints

    m.add(IlcAllDiff(vars, IlcWhenValue));

    IlcLinOpt simplex(m);
    simplex.add(
        100000*R + 10000*O + 1000*B + 100*E + 10*R + T
        ==
        100000*G + 10000*E + 1000*R + 100*A + 10*L + D +
        100000*D + 10000*O + 1000*N + 100*A + 10*L + D,
        IlcTrue);   // Post to Solver as well

136

The Search for solutions

    m.add(Generate(m, simplex, vars));   // Search strategy

    if (m.nextSolution()) {              // Find a solution
        m.out() << " solution found " << endl;
    }
    m.printInformation();
    m.end();

}

137

Branch if a variable is non-integer

ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) {
    IlcInt varIndex = MostNotInteger(vars, simplex);

    if (varIndex >= 0)   // There is a non-integer variable
        return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex),
                      this);

138

Is integer relaxation a solution ?

    IlcManager m = getManager();
    if (m.solve(TestIntegerRelaxation(m, simplex)))
        return 0;

139

Find two variables with same value

    IlcInt i, j;
    for (i = 0; i < vars.getSize() - 1; i++) {
        if (vars[i].isBound()) continue;   // Can't both be bound
        IlcInt n = simplex.nearest(simplex.getCurrentValue(vars[i]));
        for (j = i + 1; j < vars.getSize(); j++) {
            IlcInt m = simplex.nearest(simplex.getCurrentValue(vars[j]));
            if (m == n) break;
        }
        if (j < vars.getSize()) break;
    }

140

Branch to push them apart

// j and i are the indices of two variables with same current value

    return IlcAnd(
        IlcOr(Smaller(m, vars[i], vars[j], simplex),
              Smaller(m, vars[j], vars[i], simplex)),
        this);   // Recursion

}

141

Pushing two variables apart

ILCGOAL3(Smaller, IlcIntVar, x, IlcIntVar, y, IlcSimplex, simplex) {
    simplex.add(x <= y-1, IlcTrue);
    return 0;
}

142

Testing the integer relaxation

ILCGOAL1(TestIntegerRelaxation, IlcSimplex, simplex) {
    simplex.trySolution();
    return 0;
}

143

Results

• ILOG Solver/Planner finds a solution in 6 nodes (.29 seconds on laptop)

• Straightforward ILOG Solver finds a solution in 8024 nodes (1.8 seconds on a laptop)

• Again, CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution

144

Example: The Dutch Trains

Scheduling intercity trains

Amsterdam, Rotterdam, Roosendaal, Vlissingen

Without coupling constraints, multi-commodity integer flow problem

With coupling constraints, a DLP with an integer relaxation

Additional logic handled directly in 2LP with CPLEX

Disjunctive Programming and Cooperating Solvers, CSTS 98 (Kluwer, edited by D. Woodruff)

Conclusions

• CP and MIP are powerful techniques that can solve many combinatorial problems

• Each has preferred formulations

• Can get even greater benefits when combining CP and IP algorithms

146

Recent and Current Work

Beaumont

Beringer, DeBacker

Balas, Ceria, Cornuejols.

Wallace, Rodosek, Schrimpf

Heipke, Colombani

Bockmayr

McAloon, Tretkoff, Wetzel

147

III. Exploiting Randomization to Solve Hard Combinatorial

Problems

148

Background

Combinatorial search methods often exhibit a remarkable variability in performance. It is common to observe significant differences between:

- different heuristics
- same heuristic on different instances
- different runs of same heuristic with different seeds (stochastic methods)

149

Main Claim

One can take advantage of the extreme variability of combinatorial search methods:

One can improve the performance of a deterministic complete method by introducing a stochastic element, while maintaining completeness.

We’ll explain WHY that is the case.

150

A Structured Benchmark Domain for Studying the Distributions of Search Methods

Stochasticity in Search Procedures

Intriguing Properties of Complete Backtrack Style Algorithms

Consequences for Algorithm Design - Rapid Randomized Restarts

Portfolio of Algorithms

151

Structured Benchmark Domain

152

Background

Study of local and systematic search methods has been driven by:

Random instance distributions (Hogg et al. 96). Limitation: lack of structure that characterizes realistic problems;

Highly structured problems (Fujita et al. 93). Limitation: “too much” structure.

We propose a benchmark domain that bridges the gap between purely random instances and highly structured problems.

Gomes and Selman 1997 - Proc. AAAI-97

153

Quasigroups

Defn.: a pair (Q, *) where Q is a set, and * is a binary operation on Q such that the equations a * x = b and y * a = b are uniquely solvable for every pair of elements a, b in Q.

The multiplication table of its binary operation defines a latin square (i.e., each element of Q appears exactly once in each row/column).

Example: Quasigroup of order 4
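The latin square property in the definition above can be checked mechanically. The following is a minimal sketch, assuming symbols are encoded as integers 0..n-1; the names `isLatinSquare` and `cyclicTable` are illustrative and do not come from the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true when `table` is an n x n latin square over the symbols 0..n-1,
// i.e. every symbol occurs exactly once in each row and each column.
bool isLatinSquare(const std::vector<std::vector<int>>& table) {
    const std::size_t n = table.size();
    for (const auto& row : table)
        if (row.size() != n) return false;  // must be square

    for (std::size_t k = 0; k < n; ++k) {
        std::vector<bool> seenInRow(n, false), seenInCol(n, false);
        for (std::size_t j = 0; j < n; ++j) {
            const int r = table[k][j];  // j-th entry of row k
            const int c = table[j][k];  // j-th entry of column k
            if (r < 0 || c < 0) return false;
            if (static_cast<std::size_t>(r) >= n) return false;
            if (static_cast<std::size_t>(c) >= n) return false;
            if (seenInRow[r] || seenInCol[c]) return false;  // repeated symbol
            seenInRow[r] = true;
            seenInCol[c] = true;
        }
    }
    return true;
}

// Hypothetical helper: the table of addition modulo n, which is one
// quasigroup (in fact a group) of order n.
std::vector<std::vector<int>> cyclicTable(int n) {
    assert(n >= 0);
    std::vector<std::vector<int>> t(n, std::vector<int>(n));
    for (int a = 0; a < n; ++a)
        for (int b = 0; b < n; ++b) t[a][b] = (a + b) % n;
    return t;
}
```

For example, `isLatinSquare(cyclicTable(4))` holds, while copying one entry of a row over its neighbour breaks the property.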

154

Given a partial latin square, can it be completed?

Example:

Quasigroup Completion Problem (QCP)

155

Quasigroup Completion Problem: A Framework for Studying Search

NP-Complete (Colbourn 1983, 1984; Anderson 1985).

Has a structure not found in random instances.

Leads to interesting search problems when structure is perturbed.

The study of this problem led us to identify the unusual distributions of combinatorial search (Gomes, Selman & Crato --- CP97)

156

Aside: Applications of Quasigroups

Design of statistical experiments

eliminating data dependencies

Scheduling/Timetabling (Anderson 1992)

completing a schedule given a set of pre-defined events

Automated theorem proving (Fujita et al. 1993)

existence vs. non-existence of quasigroups with intricate mathematical properties

157

Example: Scheduling of Drug Experiment

Given 5 different drugs, test the effects of the different medications on 5 different subjects over different days of the week.

Use constraint: No two people get the same brand on the same day (eliminate bias for day of the week).

158

Quasigroup Completion

                                  DAY
SUBJECT   Mon.       Tues.      Wed.       Thurs.     Fri.
Tim       Tylenol    Aleve      Bayer      Exhedrin*  Advil
Sue       Aleve      Bayer      Exhedrin   Advil      Tylenol
Frank     Bayer      Exhedrin   Advil*     Tylenol    Aleve
Teresa    Exhedrin   Advil*     Tylenol    Aleve      Bayer
Todd      Advil      Tylenol    Aleve      Bayer      Exhedrin

(*) Pre-assigned

159

QCP has a natural formulation as a Constraint Satisfaction Problem

- variable for each NxN entry
- constraints capture row/column requirement
- variable assignments capture pre-assigned values
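The CSP formulation above can be made concrete with a small sketch that computes the current domain of one variable under the row and column all-different constraints. The encoding is an assumption made for illustration: cells hold values 0..N-1, `-1` marks an unassigned cell, and `Grid`, `kEmpty` and `candidates` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

using Grid = std::vector<std::vector<int>>;  // N x N table of cell values
constexpr int kEmpty = -1;                   // marks an unassigned cell

// Domain of the variable at (row, col): all values 0..N-1 that do not yet
// appear in the same row or the same column. Pre-assigned cells simply
// carry their value in the grid.
std::vector<int> candidates(const Grid& g, int row, int col) {
    const int n = static_cast<int>(g.size());
    assert(row >= 0 && row < n && col >= 0 && col < n);
    std::vector<bool> used(n, false);
    for (int k = 0; k < n; ++k) {
        if (g[row][k] != kEmpty) used[g[row][k]] = true;  // row constraint
        if (g[k][col] != kEmpty) used[g[k][col]] = true;  // column constraint
    }
    std::vector<int> out;
    for (int v = 0; v < n; ++v)
        if (!used[v]) out.push_back(v);
    return out;
}
```

An empty result for an unassigned cell means the partial latin square cannot be completed from the current assignment.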

160

How does the difficulty of QCP vary with the fraction of pre-assignment?

161

[Figure: median number of backtracks (log) vs. fraction of pre-assignment; regions: underconstrained area, critically constrained area, overconstrained area]

162

Complexity Graph shows (up to order 20):

curve peaks around 42% of pre-assignment --- critically constrained area.

under-constrained and over-constrained areas are easier.

163

Directly related to the peak in computational difficulty is the so-called phase transition graph for the QCP problem.

164

[Figure: fraction of unsolved cases vs. fraction of pre-assignment; regions: almost all solvable area, phase transition area, almost all unsolvable area]

165

Phase Transition

QCP Phase Transition --- threshold phenomenon from almost all solvable to almost all unsolvable --- occurs around 42% of pre-assignment.

It’s called a phase transition because of the close relation to state transition phenomena studied in physics, such as the melting of a solid into a liquid.

166

Exploiting Structure

167

Exploiting Structure in QCP

Forward Checking

Arc Consistency on binary constraints

168

Further Exploiting Structure in QCP

Arc Consistency on Binary Constraints

General Arc Consistency on all different constraints

Shaw, Stergiou and Walsh - ECAI98

169

Enforcing General Arc Consistency on All Different Constraints

• Beautiful example of integration of AI/OR techniques for a well defined sub-problem

• Propagation uses Maximum Matching problem (particular case of Network Flow problems which have polynomial time complexity)

Regin - AAAI94

170

Further Exploiting Structure in QCP

By enforcing general arc consistency on all different constraints, problems up to order 50 could be solved!

Shaw, Stergiou and Walsh - ECAI98; Regin - AAAI94

171

Stochasticity in Search Procedures

172

Background

Stochastic strategies have been very successful in the area of local search.

Limitation: inherent incomplete nature of local search methods.

We want to explore the addition of a stochastic element to a systematic search procedure without losing completeness.

173

We introduce stochasticity in a backtrack search method by randomly breaking ties in variable and/or value selection.

Compare with standard lexicographic tie-breaking.
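A minimal sketch of the idea, assuming a plain chronological backtracking search on quasigroup completion with no value heuristic, so that all values tie and a random shuffle acts as the tie-breaker. It is not the authors' solver: `solve`, `fits` and `SearchStats` are hypothetical names, cells use `-1` for "unassigned", and variable selection stays lexicographic.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

using Grid = std::vector<std::vector<int>>;  // -1 marks an unassigned cell

struct SearchStats { long backtracks = 0; };

// True if value v can be placed at (r, c) without repeating in row r or column c.
bool fits(const Grid& g, int r, int c, int v) {
    const int n = static_cast<int>(g.size());
    for (int k = 0; k < n; ++k)
        if (g[r][k] == v || g[k][c] == v) return false;
    return true;
}

// Backtracking over the cells in lexicographic order, starting at `cell`.
// With rng == nullptr values are tried in increasing order (deterministic);
// otherwise the value order is shuffled at every node (random value selection).
bool solve(Grid& g, int cell, std::mt19937* rng, SearchStats& stats) {
    const int n = static_cast<int>(g.size());
    assert(cell >= 0 && cell <= n * n);
    if (cell == n * n) return true;  // every cell is assigned
    const int r = cell / n, c = cell % n;
    if (g[r][c] != -1) return solve(g, cell + 1, rng, stats);  // pre-assigned

    std::vector<int> values(n);
    for (int v = 0; v < n; ++v) values[v] = v;
    if (rng) std::shuffle(values.begin(), values.end(), *rng);

    for (int v : values) {
        if (!fits(g, r, c, v)) continue;
        g[r][c] = v;
        if (solve(g, cell + 1, rng, stats)) return true;
        g[r][c] = -1;  // undo and try the next value
        ++stats.backtracks;
    }
    return false;  // every value was tried, so the search remains complete
}
```

Because the shuffle only changes the order in which values are tried, never which values are tried, the randomized version still proves unsolvability when no completion exists.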

174

Randomized Strategies

Strategy   Variable sel.   Value sel.
DD         deterministic   deterministic
DR         deterministic   random
RD         random          deterministic
RR         random          random

175

176

177

178

Lesson: Randomized tie-breaking can improve performance over a purely deterministic strategy.

Next: But we can obtain a more dramatic advantage from randomization ...

179

Cost Distributions

Key Properties:

I Erratic behavior of mean.

II Distributions have “heavy tails”.

180

[Figure: sample mean vs. number of runs; annotations: Median = 1!, 3500!; axis ticks: 500, 2000]

181


182

[Figure: proportion of cases solved vs. number of backtracks (two panels); annotations: 75%<=30, 5%>100000]

183

Heavy-Tailed Distributions

… infinite variance … infinite mean

Introduced by Pareto in the 1920’s --- “probabilistic curiosity.”

Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena.

Examples: stock market, earthquakes, weather, ...

184

Decay of Distributions

Standard --- Exponential Decay, e.g. Normal:

$\Pr[X > x] \sim C e^{-x^2}$, for some $C > 0$, $x > 1$

Heavy-Tailed --- Power Law Decay, e.g. Pareto-Levy:

$\Pr[X > x] \sim C x^{-\alpha}$, $x > 0$

185

[Figure: tail of a standard distribution (finite mean & variance) with exponential decay, compared with a tail with power law decay]

186

Normal, Cauchy, and Levy

Normal - Exponential Decay

Cauchy - Power law Decay

Levy - Power law Decay

187

Tail Probabilities (Standard Normal, Cauchy, Levy)

c   Normal       Cauchy   Levy
0   0.5          0.5      1
1   0.1587       0.25     0.6827
2   0.0228       0.1476   0.5205
3   0.001347     0.1024   0.4363
4   0.00003167   0.078    0.3829
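The tail probabilities Pr[X > c] in the table can be recomputed from closed forms. This sketch assumes the standard parameterizations (normal with mean 0 and variance 1; Cauchy and Levy with location 0 and scale 1); the function names are illustrative.

```cpp
#include <cassert>
#include <cmath>

const double kPi = std::acos(-1.0);

// Pr[X > c] for a standard normal variable.
double normalTail(double c) { return 0.5 * std::erfc(c / std::sqrt(2.0)); }

// Pr[X > c] for a standard Cauchy variable.
double cauchyTail(double c) { return 0.5 - std::atan(c) / kPi; }

// Pr[X > c] for a standard Levy variable, which is supported on x > 0.
double levyTail(double c) {
    assert(!std::isnan(c));
    if (c <= 0.0) return 1.0;  // all probability mass lies above 0
    return std::erf(1.0 / std::sqrt(2.0 * c));
}
```

At c = 4 the normal tail is already of order 1e-5, whereas the Cauchy and Levy tails are still about 0.08 and 0.38, which is the contrast between exponential and power law decay.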

188

How to Check for “Heavy Tails”?

Log-Log plot of tail of distribution should be approximately linear.

Slope gives value of $\alpha$:

$\alpha \le 1$: infinite mean and infinite variance

$1 < \alpha < 2$: infinite variance
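One way to carry out the log-log check numerically is an ordinary least-squares fit of log Pr[X > x] against log x. This is a rough sketch rather than a careful tail-index estimator; `logLogSlope` is a hypothetical name, and the inputs are assumed to be strictly positive.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Least-squares slope of log(survival) against log(x).
// xs[i] > 0 are cost values and survival[i] > 0 the matching estimates of
// Pr[X > xs[i]]. For a power-law tail C * x^(-alpha) the slope equals -alpha.
double logLogSlope(const std::vector<double>& xs,
                   const std::vector<double>& survival) {
    assert(xs.size() == survival.size() && xs.size() >= 2);
    const double n = static_cast<double>(xs.size());
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
        const double lx = std::log(xs[i]);
        const double ly = std::log(survival[i]);
        sx += lx;
        sy += ly;
        sxx += lx * lx;
        sxy += lx * ly;
    }
    return (n * sxy - sx * sy) / (n * sxx - sx * sx);
}
```

The estimate of α is the negated slope. In practice the fit would be restricted to the tail of the empirical distribution, since the power law only describes large x.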

189

Example of Heavy Tailed Model(Random Walk)

Random Walk:
• Start at position 0
• Toss a fair coin:
    • with each head take a step up (+1)
    • with each tail take a step down (-1)

X --- number of steps the random walk takes to return to position 0.
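The random variable X can be sampled with a few lines of code. The sketch below takes the coin as a callable so the walk can be driven by any source of tosses; `firstReturnTime` and `FairCoin` are hypothetical names, and the step limit is needed because X has no finite mean, so an individual walk can run for a very long time.

```cpp
#include <cassert>
#include <random>

// Number of steps until a +1/-1 walk started at 0 first returns to 0,
// or -1 if that has not happened within maxSteps.
// `coin()` returns true for heads (step up) and false for tails (step down).
template <class Coin>
long firstReturnTime(Coin&& coin, long maxSteps) {
    assert(maxSteps >= 0);
    long position = 0;
    for (long step = 1; step <= maxSteps; ++step) {
        position += coin() ? 1 : -1;
        if (position == 0) return step;
    }
    return -1;
}

// Fair coin driven by a seeded Mersenne Twister (the seed is an arbitrary choice).
struct FairCoin {
    std::mt19937 rng;
    explicit FairCoin(unsigned seed) : rng(seed) {}
    bool operator()() { return (rng() & 1u) != 0; }
};
```

Calling `firstReturnTime(coin, limit)` repeatedly with a `FairCoin` yields a sample of X. Returns can only happen after an even number of steps, and half of all walks return after exactly 2 steps.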

190

The record of 10,000 tosses of an ideal coin (Feller)

[Figure: position of the walk over the tosses; annotations: zero crossing, long periods without zero crossing]

191

Random Walk

Heavy-tails vs. Non-Heavy-Tails

[Figure: unsolved fraction 1-F(x) vs. X, the number of steps the walk takes to return to zero (log scale); curves: random walk, Normal(2,1), Normal(2,1000000); annotations: Median=2 (50%), 0.1%>200000]

192

Heavy-tails in QCP Domain

[Figure: unsolved fraction 1-F(x) vs. number of backtracks (log); curve annotations: 0.466, 0.319, 0.153]

$\alpha < 1$ => Infinite mean

193

The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior.

194

195

196

Heavy Tailed Cost Distribution

[Figure: log( 1 - F(x) ) vs. log( Backtracks ); x axis from 1 to 100000, y axis from 0.1 to 1]

197

The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior.

198

By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution.

Right-hand side: non-negligible fraction of very long runs

Left-hand side: non-negligible fraction of very short runs

199

Sports Scheduling

[Figure: cumulative distribution function vs. number of backtracks (log); annotations: 70%>250000, 15!, 1%<=650!]

200

[Figure: tail of a standard distribution (finite mean & variance) with exponential decay, compared with a tail with power law decay]

Also, heavy tails on left. (High probability of very short runs.)

201

Consequence for algorithm design:

Use rapid restarts or parallel / interleaved runs

Super-linear speedups!!!

202

Super-linear Speedups

[Figure: five runs marked X (unsolved, 10 seconds each) and one run marked solved]

Sequential: 50 + 1 = 51 seconds
Parallel: 10 machines --- 1 second, 51 x speedup
Interleaved (1 machine): 10 x 1 = 10 seconds, 5 x speedup

203

Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution.

RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left.

Restarts provably eliminate heavy tails (Gomes, Selman & Crato)

204

Sketch of proof of elimination of heavy tails

Let’s truncate the search procedure after m backtracks.

$X$ = number of backtracks to solve the problem

Probability of solving problem with truncated version:

$p_m = \Pr[X \le m]$

Run the truncated procedure and restart it repeatedly.

205

$Y$ = total number of backtracks with restarts

Number of restarts: $Y/m \sim \mathrm{Geometric}(p_m)$

$\bar{F}(y) = \Pr[Y > y] \le (1 - p_m)^{\lfloor y/m \rfloor} \le c_1 e^{-c_2 y}$

Y does not have heavy tails.
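The geometric argument also gives the expected total cost of the restart strategy in closed form. The sketch below assumes independent runs and a known discrete cost distribution, which would have to be estimated in practice; `expectedCostWithRestarts` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// pmf[k] = Pr[X = k + 1]: probability that one run needs exactly k + 1 backtracks.
// Returns the expected total number of backtracks when every run is cut off
// after m backtracks and restarted independently until one run succeeds.
double expectedCostWithRestarts(const std::vector<double>& pmf, std::size_t m) {
    assert(m >= 1 && m <= pmf.size());
    double pm = 0.0;       // p_m = Pr[X <= m]
    double partial = 0.0;  // sum of x * Pr[X = x] over x <= m
    for (std::size_t x = 1; x <= m; ++x) {
        pm += pmf[x - 1];
        partial += static_cast<double>(x) * pmf[x - 1];
    }
    assert(pm > 0.0);
    // Failed runs: their number has mean (1 - p_m) / p_m and each costs m.
    // Successful run: its expected cost is E[X | X <= m] = partial / p_m.
    return static_cast<double>(m) * (1.0 - pm) / pm + partial / pm;
}
```

As an illustration, if a run takes 1 backtrack with probability 0.5 and 1000 backtracks otherwise, the expected cost without restarts is 500.5, while a cutoff of m = 1 brings it down to 2.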

206

Restarts

[Figure: unsolved fraction 1-F(x) vs. number of backtracks (log); annotations: 70% unsolved; 250 (~62.5 restarts)]

207

Example of Rapid Restart Speedup (planning)

[Figure: number of backtracks (log) vs. cutoff (log); annotations: 20; 2000 (~100 restarts); 100000 (~10 restarts)]

208

Summary Results

Problem               Deterministic   R3
Logistics Planning    108 mins.       95 sec.
Scheduling 14         411 sec         250 sec
Scheduling 16         ---(*)          1.4 hours
Scheduling 18         ---(*)          ~18 hrs
Circuit Synthesis 1   ---(*)          165 sec.
Circuit Synthesis 2   ---(*)          17 min.

(*) not found after 2 days

209

Our results provide the first indication of heavy-tailed distributions in a computational model.

Overall insight: Randomized tie-breaking with rapid restarts gives a powerful search strategy.

210

Heavy-Tailed Distributions in Other Domains

Quasigroup Completion Problem

Graph Coloring

Logistic Planning

Circuit Synthesis

Gomes, Selman, and Crato 1997; Gomes, Selman, McAloon, and Tretkoff 1998; Gomes, Kautz, and Selman 1998.


212

Rapid Restart Speedup

[Figure: log( backtracks ) vs. log( cutoff )]


214

Heavy-Tailed Distributions in Other Domains

Quasigroup Completion Problem

Graph Coloring

Logistic Planning

Circuit Synthesis

Gomes, Selman, and Crato 1997 - Proc. CP97; Gomes, Selman, McAloon, and Tretkoff 1998 - Proc. AIPS98; Gomes, Kautz, and Selman 1998 - Proc. AAAI98.

215

Algorithm Portfolio Design

Gomes and Selman 1997 - Proc. UAI-97; Gomes, Selman, and Crato 1997 - Proc. CP97.

216

Motivation

The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances.

Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths.

217

Branch & Bound: Best Bound vs. Depth First Search

218

Branch & Bound (Randomized)

Standard OR approach for solving Mixed Integer Programs (MIPs)

• Solve linear relaxation of MIP

• Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value)

• Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values

219

Branch & Bound: Depth First vs. Best bound

Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected.

Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic

Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value)

220

Portfolio of Algorithms

A portfolio of algorithms is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors.

Goal: to improve on the performance of the component algorithms in terms of:

- expected computational cost
- “risk” (variance)

Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk.
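Given the expected cost and risk of each candidate portfolio, the efficient set is the set of non-dominated points. A minimal sketch, assuming risk is measured by the standard deviation and that both criteria are to be minimized; `Portfolio`, `dominates` and `efficientSet` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Portfolio {
    double expectedCost;  // expected run time of the portfolio
    double risk;          // standard deviation of its run time
};

// p dominates q when p is no worse on both criteria and strictly better on one.
bool dominates(const Portfolio& p, const Portfolio& q) {
    return p.expectedCost <= q.expectedCost && p.risk <= q.risk &&
           (p.expectedCost < q.expectedCost || p.risk < q.risk);
}

// Indices of the portfolios that no other portfolio dominates (the efficient set).
std::vector<std::size_t> efficientSet(const std::vector<Portfolio>& all) {
    std::vector<std::size_t> out;
    for (std::size_t i = 0; i < all.size(); ++i) {
        assert(all[i].risk >= 0.0);
        bool dominated = false;
        for (std::size_t j = 0; j < all.size() && !dominated; ++j)
            dominated = (j != i) && dominates(all[j], all[i]);
        if (!dominated) out.push_back(i);
    }
    return out;
}
```

Within the returned set, lowering the risk is only possible at the price of a higher expected cost, and vice versa.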

221

Depth-first vs. Best-bound (logistics planning)

[Figure: cumulative frequencies vs. number of nodes; curves: Depth-First, Best-Bound; annotations: ~50%, ~30%]

222

Depth-First and Best-Bound do not dominate each other overall.

223

Heavy-tailed behavior of Depth-first

224

Portfolio for heavy-tailed search procedures (2 processors)

[Figure: expected run time of portfolios vs. standard deviation of run time of portfolios; labelled points: 0 DF / 2 BB, 2 DF / 0 BB]

225

Portfolio for heavy-tailed search procedures (6 processors)

[Figure: expected run time of portfolios vs. standard deviation of run time of portfolios; labelled points: 0 DF / 6 BB, 3 DF / 3 BB, 4 DF / 2 BB, 5 DF / 1 BB, 6 DF / 0 BB; the efficient set is marked]

226

Portfolio for heavy-tailed search procedures (20 processors)

[Figure: expected run time of portfolios vs. standard deviation of run time of portfolios; labelled points: 0 DF / 20 BB, 20 DF / 0 BB]

227

Portfolio for heavy-tailed search procedures (2-20 processors)

228

A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena.

229

Summary of Randomization

Considered randomized backtrack search.

Showed Heavy-Tailed Distributions.

Suggests: Rapid Restart Strategy.
--- cuts very long runs
--- exploits ultra-short runs

Experimentally validated on previously unsolved planning and scheduling problems.

Portfolio of Algorithms for cases where no single heuristic dominates


231

IV. CONCLUSIONS

232

Important Themes in OR

Linear Programming
(Mixed) Integer Programming

Exploit Structure
e.g., Network Flow Problems

Duality
very elegant theory in LP
sensitivity analysis

233

Opportunities for Integration of AI/OR

OR methods:
Have focused on tractable representations (LP)
Have demonstrated the ability to identify optimal and locally optimal solutions
LIMITATION: Restricted to rigid models with limited expressive power

AI methods:
Richer and more flexible representations, supporting constraint-based reasoning mechanisms as well as mixed initiative frameworks, allowing the human expertise to be in the loop.
LIMITATION: Rich representations in general lead to intractable problems

CHALLENGE: good representations / fast & good solutions

234

Opportunities for Integration of AI/OR

AI methods are becoming competitive

AI methods used to be considered unsuitable for real-world scheduling problems. Recent developments have shown they can be competitive. Examples:

SAP, Peoplesoft, I2, … -> provide solutions for scheduling combining constraint programming and mathematical programming approaches.

ILOG (CP language) has several fielded applications in different scheduling areas; ILOG has integrated a CSP solver with CPLEX.

OR people have acknowledged the benefits of combining OR and AI methods.

235

Opportunities for Integration of AI/OR

Exploiting Duality in CSP frameworks

Exploiting Randomization

Hybrid Solvers

236

Opportunities for Integration of AI/OR

Hybrid Solvers - emerging area of research (CSP+OR); it started with CLP(R), Prolog III and CHIP; ILOG integrates a CSP solver with CPLEX

local constraint propagation - local consistency algs
global constraint propagation - LP relaxations

Only a hybrid approach could prove optimality, e.g.:
Hoist scheduling (Rodosek & Wallace 1998)
Multicommodity integer network flow problem (Dutch Railways) (McAloon, Tretkoff, Wetzel 1998)

237

Updated version of tutorial slides

www.cs.cornell.edu/gomes/

Talks
Demos

238

Appendix

239

Portfolio of Algorithms

A portfolio of algorithms is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors.

A portfolio has an expected computational cost and a standard deviation, a measure of the dispersion of the computational cost.

The standard deviation of the portfolio is a measure of the risk inherent to the portfolio.

240

Portfolio of Algorithms

Goal: to improve on the performance of the component algorithms in terms of:

- expected computational cost
- “risk” (variance)

Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk.

241

Appendix: Portfolio of Algorithms

Goal: to improve on the performance of the component algorithms in terms of:

- expected computational cost;
- risk;

Efficient Set or Efficient Frontier - set of portfolios that are the best in terms of expected value and risk.

Within the efficient set, in order to minimize the risk, one has to deteriorate the expected value or, in order to improve the expected value, one has to increase the risk.

242

Appendix Portfolio of Two Algorithms

Let us consider the random variables:

A1 - the number of backtracks that algorithm 1 takes to find a solution or prove that a solution doesn’t exist;

A2 - the number of backtracks that algorithm 2 takes to find a solution or prove that a solution doesn’t exist;

243

Appendix Portfolio of Two Algorithms

Let us consider that we have N processors and we design a portfolio using n1 processors with algorithm 1 and n2 processors with algorithm 2 (N = n1 + n2).

Let us consider the random variable:

X - the number of backtracks that the portfolio takes to find a solution or prove that a solution doesn’t exist;

244

Appendix: Portfolio of Two Algorithms

Given N processors, with $n_1 = N$ and $n_2 = 0$:

$P[X = x] = \sum_{i=1}^{N} \binom{N}{i} P[A_1 = x]^{i} \, P[A_1 > x]^{(N - i)}$

245

Appendix Portfolio of Algorithms

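Given N processors, such that $0 \le n_1 \le N$ and $n_2 = N - n_1$:

$P[X = x] = \sum_{i=1}^{N} \sum_{i'=0}^{n_1} \binom{n_1}{i'} P[A_1 = x]^{i'} \, P[A_1 > x]^{(n_1 - i')} \times \binom{n_2}{i''} P[A_2 = x]^{i''} \, P[A_2 > x]^{(n_2 - i'')}$

where $i'' = i - i'$, and the term in the summation is 0 when $i'' < 0$ or $i'' > n_2$.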
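The portfolio distribution can be evaluated directly for one value of x. This is a sketch under the assumption that all runs are independent: it enumerates how many copies of each algorithm finish exactly at x, which is the same set of terms as the double summation. The names `choose` and `portfolioPmf` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Binomial coefficient C(n, k) as a double (adequate for small processor counts).
double choose(int n, int k) {
    assert(0 <= k && k <= n);
    double c = 1.0;
    for (int j = 1; j <= k; ++j) c = c * (n - k + j) / j;
    return c;
}

// Pr[X = x] for a portfolio with n1 copies of algorithm 1 and n2 copies of
// algorithm 2. eq1 = Pr[A1 = x], gt1 = Pr[A1 > x], and likewise eq2, gt2.
// i1 and i2 count the copies of each algorithm that finish exactly at x;
// at least one copy must do so and all remaining copies must still be running.
double portfolioPmf(int n1, int n2, double eq1, double gt1,
                    double eq2, double gt2) {
    double total = 0.0;
    for (int i1 = 0; i1 <= n1; ++i1) {
        for (int i2 = 0; i2 <= n2; ++i2) {
            if (i1 + i2 == 0) continue;  // i = i1 + i2 ranges over 1..N
            total += choose(n1, i1) * std::pow(eq1, i1) * std::pow(gt1, n1 - i1) *
                     choose(n2, i2) * std::pow(eq2, i2) * std::pow(gt2, n2 - i2);
        }
    }
    return total;
}
```

By the binomial theorem the sum collapses to Pr[A1 >= x]^n1 Pr[A2 >= x]^n2 minus Pr[A1 > x]^n1 Pr[A2 > x]^n2, which gives a quick cross-check of the enumeration.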

246

Preliminary Research on Structure of Search Spaces

247

Fringe of Search Tree

248

Fractal Dimension

249

Fractal Dimension

When plotting the length of a curve as a function of the measuring tool on a log-log plot, one obtains a linear relationship:

$L = c \, (1/s)^{d}$

L - the measured length;
s - length of the yardstick;
c and d are constants.

Mandelbrot introduced the fractal dimension D = d + 1.
A straight line has D = 1.0.
The coast of Britain has fractal dimension 1.22.
The higher D, the more fractal the curve is.

250

Heavy-Tailed Behavior vs Non-heavy-tailed behavior
