integration of artificial intelligence and operations research for
TRANSCRIPT
1
Integration of Artificial Intelligence and Operations Research Techniques
for Combinatorial Problems
Carla P. GomesCornell University
Ken McAloon and Carol TretkoffILOG
{mcaloon,tretkoff}@ilog.com
2
AI, OR, and CS
AI OR
CS
3
Integration of Artificial Intelligence &Operations Research
TechniquesAI
RepresentationsConstraint Languages
Logic FormalismsObject-Oriented Prog.
Bayesian NetsRule Based Systems
• • •Tools
Constraint PropagationSystematic SearchStochastic Search
• • •Pros / Cons
Rich RepresentationsComputational Complexity
ORRepresentations
MathematicalModeling LanguagesLinear & Non-linear
(In)Equalities• • •Tools
Linear ProgrammingMixed-Integer Prog.Non-linear Models
• • •Pros / Cons
More Tractable (LP)Primarily Complete InfoLimited Representations
Combinatorial Problems
Planning Scheduling
THE CHALLENGEAI OR
UNIFY APPROACHES TO:
SCALE UP SOLUTIONSHANDLE UNCERTAINTYANALYZE COMPLEXITY
(phase transition)FRAGILE
EXPLOIT PROBLEM STRUCTURE
INCREASE ROBUSTNESS
31 - 45: ACPOWER? 0 NUM-UNAV-RESS 1UNAV-RES-MAP (DIV2 D24BUS-3 D24-2 D24-1) (ACPLOSS D24BUS-3 D24-2
ROME LABORATORY OUTAGE MANAGER (ROMAN)
Parameters Load Run Gantt Charts Utilities Parameters Load Run Gantt Charts Utilities
AC-POWER StatusAC PowerDIV1DIV2DIV3DIV4
0 10 20 30 40 50 60 70 80 90
GoalStart
4
Outline
I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers
III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions
5
I. Short OR Overview
6
Outline for Linear Programming and Integer Programming
• Standard Form of LP and a Simple Example • Geometric Interpretation of LP• Complexity issues• MIP• Example: Fast Food• Example: Capacitated Warehouse• Example: 911
7
Outline
1. Short Overview of OR
2. Constraint Programming3. Cooperating Solvers
4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard
Combinatorial Problems 6. Conclusions
8
Optimization Technology Evolution
Dispatch Rules
1960 1970 1980 1990
SA, GA, Tabu
CPMPERT
Constraint-based Scheduling
19981947
Primal Simplex LP
ParallelLP/MIP
ConcurrentScheduling
Interior Point
ConstraintPropagation
Large IPsMIP
ShiftingBottleneck
First CP Systems
CooperatingSolvers (LP/CP)
Global constraints
Barrier LPBarrier Crossover
Dual Simplex Implementation
Dual Simplex
9
1. Short OR Overview
10
Outline for Linear Programming and Integer Programming
• Standard Form of LP and a Simple Example • Geometric Interpretation of LP• Complexity issues• MIP• Example: Fast Food• Example: Capacitated Warehouse• Example: 911
11
An LP Story
A factory can produce n products from m partsFor product j it needs aij units of part iThere are bi units of part i availableEach unit of product j sold earns cj
Amount of each product to make is unknown xj 0Each part i determines a constraintai1 x1 + … + ain xn bi
Obvious solution: do nothing
Better: maximize c1 x1 + … + cn xn
12
Standard Forms of LP
A linear program (LP) in standard form (Dantzig 1947)
max cTx subject to Ax b x 0
Input data: c (n x 1), A (m x n), b (m x 1).Variables: x (n x 1)
13
Standard Forms of LP
// The objective functionmax c1 x1 + … + cn xn
// The constraintssubject to
a11 x1 + … + a1n xn b1
...am1 x1 + … + amn xn bm
x1 0 , … , xn 0
14
Standard Forms of LP
• In OR emphasis is on optimality
• Solution means optimal solution
• Feasible solution means solution in the ordinary sense
15
Standard Forms of LP
Interpretation of standard form:• xj = amount of product j to make• cj = revenue per unit product j• bi = available amount of component i• aij = units of i used per unit of j produced
The constraints “say”: aijxj = units of i used by j= units of i used bi
16
What are models?A model is a data-independent abstraction of a problemA model lets you write down the mathematical representation of a
model independently of the data
ProjectModel Data
OneProblemInstance
17
Products Could be Jewelry
Products: Rings and EarringsComponents: Gold and Diamonds
One ring requires 3 units of Gold, and 1 DiamondOne set of earrings requires 2 units of Gold, and 2 Diamonds
Total Gold and Diamonds are limited
Profit is different for Rings than for Earrings
Products = { rings, earrings };Components = { Gold, Diamonds };
demand = [ [3, 1], [2, 2] ];
stock = [150, 180];
profit = [60, 40];
18
Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl
Components: Nitrogen, Hydrogen, Chlorine
One unit of Gas requires 1 unit of Nitrogen, 3 units HydrogenOne unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit
of Chlorine
Total Nitrogen, Hydrogen, Chlorine is limited
Profit is different for Gas than Chloride
Products Could be Chemicals
Products = { gas, chloride };
Components = { nitrogen, hydrogen, chlorine };
demand = [ [1, 3, 0], [1, 4, 1] ];
stock = [50, 180, 40];
profit = [30, 40];
19
The Problems Have One Model
enum Products ...;enum Components ...;
float+ demand[Products, Components] = ...;float+ profit[Products] = ...;float+ stock[Components] = ...;
var float+ production[Products];
maximize sum (p in Products) profit[p] * production[p]
subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c]};
Data
DecisionVariables
Objective Function
Constraints
20
OR Modeling Systems
• OPL• AMPL• 2LP• AIMMS• GAMS• MPL• ILOG Planner• etc
21
The Dual
The dual linear program (von Neumann 1947);
min yTbsubject to yTA c y 0
Variables y (m x 1)
Awesome Symmetry - The dual of the dual is the primal
22
Rows and Columns Exchanged
min b1 y1 + … + bm yn
subject to
a11 y1 + … + am1 ym c1
...a1n y1 + … + amn ym cn
y1 0 , … , ym 0
23
Duality Theorem
Theorem: min yTb = max cTx
• Consequence: This turns optimality problem into a feasibility problem in x and yAx b x 0yTA cT
y 0 yTb = cTx
• Consequence: Enumeration not needed to verify optimality
24
Duality Theorem
• Sensitivity Analysis• Consequence: The solution values y* for the y variables
yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b
yi * = Z / bi where Z is the optimum
Reference: McAloon and Tretkoff [1996] Wiley
Duality
Two different views of the same phenomenon
Point vs Set
Arc vs Node
Momentum vs Position
Vector vs Hyperplane
Landlord vs Renter
26
Simplex and Barrier
• The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function
• The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope
27
Geometric Interpretation of LP
X
Y
Max: Xsubject to:
-X + Y <= 4X + 4*y <= 362*X + y <= 23X + Y >= 4Y >= X + 10
(0,4)
(4,0) (8,0)
(10,3)
(4,8)
Barrier
Simplex
28
Complexity of Linear Programming
Simplex MethodWorst-case --- exponential (Klee and Minty 72)Practice --- good performance
Ellipsoid MethodKhachian’s Ellipsoid Method Worst-case --- polynomialPractice --- poor performance
29
Complexity of Linear Programming
Interior Point Methods or Barrier Methods“Karmarkar’s” (and variants) Method
Worst-case --- polynomialPractice --- good performance
30
Complexity of Linear Programming
• Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice.
• Which method performs best is problem dependent.
31
Success Stories
• Industrial PlanningGiven current resources, decide what to produce in what quantity
• Supply Chain ManagementMultiperiod planning models that link flow from one period to the next
• Network FlowHow best to route goods across a network
32
Assumptions of Linear Programming
• Linearitywhen violated: ( xy = 50)Nonlinear programming
• Continuity when violated: (x integral)(Mixed) Integer programming
33
Assumptions of Linear Programming - continued
• No Disjunctive Constraintswhen violated: (x 100 or x 0)Disjunctive programmingAdditional 0-1 variables and Big M constraints
• Certaintywhen violated: (cost c is a random variable)Stochastic programming
34
Search and MIP
• In order to deal with variables that must have integer values in the solution, a search must be performed.
• Mixed Integer Programming problems are combinatorial optimization problems and are NP hard
• feasibility is NP-Complete• verifying optimality is co-NP-Complete
35
MIP and Combinatorial Optimization
• These problems have been attacked by both the AI and OR communities.
• In AI, these problems are attacked as CSPs or as Planning Problems.
• In OR, they are done as MIPs and use linear relaxation to help guide the search.
• The overriding idea in each case is to limit search.
36
Integer Program: All Integer Points in Region
37
Cut to Create Integer Vertex
Integer Vertex
38
Example - Fast Food
• Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet?
• If so, what is the least he can do it for?
39
Nutritional Requirements
• At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron
• At least 55 grams of protein• At most 3000 milligrams of sodium• At most 30% of the calories can come from fat
• Nutritional information is available from fast food outlets
40
College Student’s Requirements
• At least 2000 calories a day• No more than 3 servings of any one food• Milk only with cereal and not as a stand-alone
drink
41
Fast Food - MIP Model
• We will have variables Servk to represent the number of servings of item k in the plan.
• The variable Servk will have to take an integer value for the solution to be valid.
• The objective function: Z for cost
42
Fast Food - MIP Model
• Let foodk,j represent the percent of RDA of nutrient j in a serving of item k
• The for each nutrient j, we have a constraint
foodk,j Servk 100 k
43
Fast Food - MIP Model
• Let sodiumk represent the amount of salt in a serving of item k
• For salt we have the constraint sodiumk Servk 3000 k
• Similarly for fat
44
Fast Food - MIP Model
• Let costk represent the cost of a serving of item k
• For the objective function we have the defining constraint
costk Servk = Z k
45
Fast Food - Solution
• With a MIP solver and a way to input these constraints we ask for
• a solution that makes the variables Servk integral
• and which minimizes Z
46
MIP Solution Technique
• What the MIP solver does is to carry out a branch and bound search guided by
• the linear relaxation– the solution to the problem with the integrality
requirements relaxed
• Initialize the global variable best_so_far to 1000 (or something else very big).
47
At a Node
• Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if
z* best_so_far ,
• If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node.
48
Branching at a node
• Choose a variable Servk whose value s* is not integral.
• Typical heuristic: most non-integral variable
• Create two child nodes,• add Servk floor(s*) • add Servk ceil(s*)
49
Good News
• The linear relaxation can prune nodes before all variables Servk are forced to be integral.
• Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why
• A solution to the LP is at a vertex• A vertex is defined as the simultaneous
solution of the equality form of n linearly independent constraints
• Many of these constraints are integer bounding constraints yielding X = integer
50
Arboreally Speaking
• Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A*
• If the linear relaxation is tight
| z*linear - z*integral | is relatively small
then z*linear is an excellent evaluation function
51
Answer - Fast Food
Total cost is 8.71
Buy 3 burgersBuy 2 friesBuy 3 honeysBuy 1 yogurt...
52
Example - Fixed Cost
• Warehouses must be rented in order to supply stores and we must decide which to use
• For each store j we know its monthly demand dj
• For each warehouse i we know its capacity ki
• For each warehouse i we know the fixed cost to run it each month fci
• For each pair i, j we know the monthly cost cij of supplying j from i
53
Example - Fixed Cost
• Xij is the fraction of store j’s demand met by i• Xij 1
• Yi is a “fuzzy” boolean• it will be 1 if the warehouse is rented• 0 if it is not rented
• Yi 1
54
Example - Fixed Cost
• Each store must be supplied X ij = 1 i
• Warehouse capacity can not be exceeded dj Xij ki j
• Tighter dj Xij ki Yi j
55
Example - Fixed Cost
• Objective function
fci Yi + cij Xij
• This yields a MIP with 0-1 variables Yi
56
Branch and Cut: An Enhanced Solution Method
• Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation
Xij Yi
• Add at a node if violated by solution to linear relaxation
• Powerful method - will solve the Imperial College OR lib CW problems very easily
57
Example - Call 911
• PCTs answer the phone 24 hours a day, 7 days a week.
• It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate.
• Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours.
58
Example - Call 911
• Each PCT has a work week of 5 days followed by 2 days off.
• Want to meet the demand with minimal or near-minimal number of PCTs.
• So need to determine how many PCTs start their work week at each hour h of the week
59
Modeling 911
• A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0 h < 168.
60
Modeling 911
• A continuous variable Z will represent the objective function
Pcth = Z h
• There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed.
61
Modeling 911
• For this constraint we need to represent the number of workers who are on duty at time h
• Certainly, those who start the week at time h are here, as are those who started the week at time h - 1
• And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break.
62
Modeling 911
• This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h
Pctw bh w
63
Call 911 solved with progressive roundoff
int b[168] = { // New York City 91130,24,18,15,14,14,15,25,34,36,38,40,41,43,46,57,57,59,61,59,55,50,45,38,32,25,20,17,15,13,17,25,32,35,38,40,42,43,47,58,57,57,59,57,55,52,47,41,33,25,20,17,15,13,15,25,32,33,37,39,42,43,47,57,56,57,57,56,53,50,47,41,34,27,22,19,16,15,16,25,31,35,37,40,44,45,48,57,57,56,58,56,53,53,46,41,34,28,23,19,16,15,17,25,33,37,39,42,45,47,51,59,58,60,61,61,57,56,57,55,48,41,35,30,26,20,18,22,26,32,42,46,49,53,54,56,56,56,59,59,57,57,56,56,52,46,41,34,29,23,18,19,25,31,36,41,46,50,52,53,52,53,54,53,50,49,45,40
};
64
Modeling 911
• Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small.
• The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer.
• The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers.
65
Modeling 911
• For this and many other applications, heuristics can be used to develop good solutions
• Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc.
66
Solving the Integer Problem
main() // Planner Code{IlcInitFloat();IlcManager m(IlcNoEdit);IlcLinOpt simplex(m);IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168);int i,j,k,h,n;
67
Solving the Integer Problem
// Pctw bh w
for(h=0;h<168;h++) { // for each hour of 168 in weekfor(j=0;j<168;j++)coeffs[j] = 0;for(k=0;k<5;k++) // for each of 5 daysfor(j=k*24;j<k*24+8;j++) // for each of 8if (j!=(k*24+4)) // hourscoeffs[(h+168-j)%168] = 1;simplex.add(IlcScalProd(coeffs,Pct) >= b[h]);}
68
Solving the Integer Linear Problem
IlcFloatVar Z = IlcSum(Pct);// Objectivesimplex.setObjMin(Z);for(i=0;i<168;i++) { //Progressive roundoffn =ceil(simplex.getCurrentValue(Pct[i]));// Fix variable and re-optimizesimplex.add(Pct[i] == n);}m.out() << “Number of Pcts needed is “ << Z << endl;m.end();
}
69
Solution
• This code finds a solution with 208 workers in a couple of seconds. The optimum is 207.
• The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution
• [Bartholdi,Ratliffe,Orlin]
70
2. Constraint Programming
LP/MIP is Beautiful, except when
• Variable domain information is important to the search strategy
• especially critical in scheduling• The problem variables range over symbolic
entities and there are lots of symmetries• timetabling
• The MIP representation can be too verbose or awkward
• configuration• There are just too many constraints
e.g. vehicle routing
72
Mathematical Basis of Constraint Programming (CP)
The Constraint Satisfaction Problem:• Suppose a finite set of variables is given
and with each variable is associated a non-empty finite domain.
• A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk) D1 x …x Dk.
• A constraint satisfaction problem (CSP) is given by a finite set of constraints.
• A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied.
73
Domain Reduction
• In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints.
• With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint.
• Accelerates convergence toward a solution
• Detects infeasibility
74
Constraint Propagation
• The other key issue is communication among the constraints or subproblems.
• The basic method used is called constraint propagation which links the constraints through their shared variables.
• The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints.
Monsieur Jordan Phenomenon
• Like prose, you have been doing constraint propagation all your life.
– Crossword puzzles
• Incomplete and so backtracking is needed– NY Times Sunday Crossword– Optical Illusions
• Origin: Vision analysis (Marr,Waltz et al)
76
Strengths of Constraint Programming
Constraint Programming provides a rich Rich
• Rich representation language.• CP variables naturally represent problem
entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT.
• Opportunity to choose a good heuristic for the solution strategy.
77
Which Method for Which App?
ProductMix
LP
ProductionPlanning
MIP
DistributionPlanning
MIP
Scheduling
ConstraintBased
Scheduling
Dispatching
CPLocal search
Configuration
CP Technology
Application
Linear => Disjunctive Constraints
Strategic => Operational Optimization
78
3. Cooperating Solvers
First Stop
CP/CP
80
Mother of All Examples - N Queens
• Do we think in terms of queens • Where do we place this queen ?
• Do we think in terms of squares • Will this square contain a queen ?
• These views are dual to one other
The Primal View
For each queen assign it a square
Place this queen in this square ?
The Dual ViewFor each square decide whether it will
have a queen
Place a queen in this square ?
83
The Primal Model
In which row do we place q[j] - the queen in column j
The constraints q[i] != q[j]q[i] - q[j] != i - jq[i] - q[j] != j - i
Note: no alldifferent constraint
84
Yet Another duality - rows vs columns
In which column do we place qq[i] the queen in row i
The constraints are the sameqq[i] != qq[j]qq[i] - qq[j] != i - jqq[i] - qq[j] != j - i
85
The Relationship
Can link them as inverse functions:
q[qq[i]] = iqq[q[j]] = j
The constraint propagation
i leaves domain of q[j] iff j leaves domain of qq[i]
86
In this primal/dual model
• Apply first-fail to q[i]
• Lo and behold one-third fewer fails
(Example from Jean Jordan’s thesis)
87
88
Q
Q
X
X
X XX X X
X
X
X
X
X
X
X
X
89
Q
Q
X
X
X XX X X
X
X
X
X
X
X
X
X
Q
X
X
X
X
X XX
X
X
X
90
Q
Q
X
X
X XX X X
X
X
X
X
X
X
X
X
Q
X
X
X
X
X XX
X
X
X
Q
X X
X
X
91
So …
The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint
The arc consistency of this global constraint is non-trivial to maintain
Network flow algorithmsflow goes from values to variableseach variable has unit demand and capacity
92
Remarks
• An IP model will encode the first dual solution
Will this square contain a queenx[i][j] = 0 or x[i][j] = 1
• A disaster beyond 30 queens• network structure on rows and columns lost
• Another example - sports scheduling
93
Constraints and Indices
• In IP symbols are represented by indices as opposed to values for variables.
• nurses, teams • Paris-St-Germain plays Manchester United on
day kxijk {0-1} to represent “team i plays team j on day
k”
• You can’t put symmetry breaking and other constraints on indices.
Second Stop
CP/IP
95
CP Is Powerful, But ….
• Sometimes, inconsistencies can be overlookedX - Y 12X + Y 10X in [1..20]Y in [1..20]
• Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution
• but an LP solver would spot this
96
2 Dimensional Bin Packing
• Application for the Automobile Industry built by Greg Glockner
97
2 Dimensional Bin Packing
• The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed.
The actual application involves circuit boards
• There are two complete models, one a CP model and the other an IP model.
The CP model directs the search The LP relaxation prunes the search space by
detecting infeasible nodes
98
2-D Bin Packing
Arrange circuit boards onto raw materialBoards may be rotatedUse same number of each board
Objective: minimize scrap
Classic combinatorial optimization problem
99
Solving 2-D Bin Packing
Use CP to generate partial solutions (nodes)Restrict placement to reduce fragmentation of
blank spaceUse tight LP to test feasibility
If any partial solution is infeasible in the LP, prune the tree immediately
CP constraints reduce the tree widthLP allows us to prune quickly
100
2 Dimensional Bin Packing
• As the search tree is traversed, the two models are in sync.
• Note that the variables used in the 2 models are disjoint
• The two models are dual to each other• The IP sees the model from the point of view
of the board, the large rectangle• The CP sees the model from the point of view
of the small rectangles• Solutions are obtained in minutes
101
2DBP: Basic CP Formulation• Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile
• Basic constraints:– Disjunctive constraints to prevent overlapping tiles
xi + wi xj yi + hi yj
xj + wj xi yj + hj yi
– Constraints to count the number of each tile type
Tile-oriented formulation
102
2DBP: Basic IP Formulation
Let xijnt = 1 if tile n of type t is in position (i, j)The constraints are:
tnjix
jix
tnx
ijnt
hjjwiijjiitnji
ntji
jiijnt
tt
,,,}1,0{
,1
,1
,,,:,,,
,
Grid-oriented formulation
103
2DBP: LP Issues
• The LP is large• The LP exhibits significant primal degeneracy• The LP exhibits significant dual degeneracy
104
2DBP: LP Issues
• The simplex algorithm cannot solve the LP• There is no way for a MIP solver to solve the IP
as such• The barrier method can solve the LP
105
2DBP: Summary
• CP as master problem– Orders tiles – Places tiles by position, then type– Selects tile type by frequency to scatter
tiles throughout the bin– Uses a one-ply lookahead constraint to
limit the position of following tile• LP relaxation prunes the CP search space
– Checks whether the partial solution will lead to an infeasible instance
• Use idiomatic formulations for CP and IP
106
2DBP: Remarks
• The CP fixes significant numbers of variables at each node
• The LP pre-processor greatly simplifies the LP
• Therefore, the lack of incrementality of the barrier method does not cost us
107
2DBP: Cooperative Algorithm Demo
108
Last Stop
• Constraint Programming and Local Search cooperation
• Another example of duality in action
109
CP/LS
Parallel machines with set-up times
Ready timesDues datesSplittable jobsRogue machines
Objectivesmeet due datesminimize setup costs
110
Two Phase Cooperation
Phase I - the Primal (Work on first objective)
• Configure and schedule the jobs
– Use constraint based scheduling
111
Two Phase CooperationMachines morph into trucks
112
Two Phase Cooperation
• Phase 2 - the Dual (Work on second objective)
• Schedule the trucks
– Use Lin, Lin-Kernighan, tabu etc
113
Parallel Machines: Cooperative Algorithm Demo
114
IC Park Example
• Hoist Scheduling (Rodosek and Wallace)
• The original model is an IP
• The CP model is “the same”
• CP guides search, LP relaxation and CP share pruning duties
• No apparent duality
115
Remarks
• One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms
• IP/LS is just around the corner
• IP/IP cooperation is hard because one can’t formulate truly dual views
– either simply not there– or too verbose
counterexamples welcome
116
4. DISJUNCTIVE PROGRAMMING
117
Disjunctive Linear Programming
• An extension of Mixed Integer Programming
• A union of polyhedral sets (feasible regions) is called a disjunctive set.
118
Disjunctive Set
119
Disjunctive Linear Programming
• The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem.
• The solution set of the disjunctive programming problem is
Fij
i<M j<N
120
Disjunctive Linear Programming Examples
• Semi-continuous variableseither X >= 100or X == 0
• Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable
121
Solution Set Inside Initial Region
122
Disjunctive Linear Programming Examples - continued
• Bollapragada, Ghattas and HookerTruss structure design problem
Branches directly on alternatives dictated by Hooke’s Law
• WyattDisjunctive programming and mean absolute
deviation models (MAD) for portfolio optimization
Extends Bender’s decomposition to disjunctive linear programs
123
Disjunctive Linear Programming continued
• Balas, Cornuejols and CeriaGenerating cuts for disjunctive programming
problems.
• McAloon and TretkoffBasic mathematical results: Optimization and Computational Logic, Wiley
124
Disjunctive Linear Programming continued
• Dealing with the disjunctive part requires search.
• This requires an engine which is not available in MIP packages
• Also the linear relaxation is not as tight and the evaluation function is not as faithful
• The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers
• Beringer and DeBacker for MIP
125
To Keep It Simple
GERALD + DONALD = ROBERT
An AI classicNewell and Simon
Assignment problem + 1 constraint
Surprisingly hard for MIP solversCPLEX MIP takes 1 minute and 29048 nodes (on
Sun Enterprise) to find a feasible integer solution
126
The Disjunctive Program
• One constraint for the equation • 100000 G + … + D = 100000 R + … + T
• For each variable X among G,…,T• X = 0 or X = 1 or … or X = 9
• For each pair X, Y• X Y-1 or Y X-1
127
Solution Set: SOME of the Integer Points in the Region
128
The Twin Variables for Cooperating Solvers
• Integer variables for the letters 0 g, e, r, a, l, d, o, n, b, t 9
• With continuous doppelgangers0 G, E, R, A, L, D, O, N, B, T 9
129
The Variables
• One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver
100000 G + 10000 E + 1000 R + … + D +100000 D + 10000 O + 1000 N + … + D =100000 R + 10000 O + 1000 B + … + T
130
The Variables
• One CSP constraint on the integer variables posted to a discrete constraint propagation engine
AllDifferent(g, e, r, a, l, d, r, n, b, t )
131
The Search
• Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely
• The branching strategy is guided by the linear relaxation on the continuous variables
• if there is a non-integral variable X, branch on it
X floor(X*) or
X ceil(X*)
132
The Search
• If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack
• If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack
133
The Search
• New wrinkle• The solution to the linear relaxation might
have all variables integral - but the AllDifferent constraint can be violated by this set of values
• In this case, branch to keep them apart• either X Y - 1• or Y X - 1
134
The Variables
void main(){
IlcInitFloat(); IlcManager m(IlcNoEdit);
IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9),G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9);
IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T);
// Continued on next slide
135
The Constraints
m.add(IlcAllDiff(vars,IlcWhenValue));
IlcLinOpt simplex(m); simplex.add(
100000*R + 10000*O + 1000*B + 100*E + 10*R + T==
100000*G + 10000*E + 1000*R + 100*A + 10*L + D +
100000*D + 10000*O + 1000*N + 100*A + 10*L + D ,
IlcTrue // Post to Solver as well);
136
The Search for solutions
m.add(Generate(m,simplex,vars)); // Search strategy
if (m.nextSolution()) { // Find a solutionm.out() << " solution found " << endl;;
}m.printInformation();m.end();
}
137
Branch if a variable is non-integer
ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex);
if (varIndex >= 0) // There is a non-integer variablereturn IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex),
this);
138
Is integer relaxation a solution ?
IlcManager m = getManager();if(m.solve(TestIntegerRelaxation(m,simplex)))
return 0;
139
Find two variables with same value
IlcInt j;for(i=0;i<vars.getSize()-1;i++) {
if (vars[i].isBound()) continue; // Can’t both be boundIlcInt n = simplex.nearest(simplex.getCurrentValue(vars[i]));for(j=i+1;j<vars.getSize();j++) {
IlcInt m =
simplex.nearest(simplex.getCurrentValue(vars[j]));if (m == n) break;
} if (j< vars.getSize()) break;
}
140
Branch to push them apart
// j and i are the indices of two variables with same current value
return IlcAnd(
IlcOr(Smaller(m,vars[i],vars[j],simplex),Smaller(m,vars[j],vars[i],simplex)),
this // Recursion);
}
141
Pushing two variables apart
ILCGOAL3(Smaller,IlcIntVar,x,IlcIntVar,y,IlcSimplex,simplex){
simplex.add(x <= y-1,IlcTrue);return 0;
}
142
Testing the integer relaxation
ILCGOAL1(TestIntegerRelaxation, IlcSimplex, simplex){
simplex.trySolution();return 0;
}
143
Results
• ILOG Solver/Planner finds a solution in 6 nodes (.29 seconds on laptop)
• Straightforward ILOG Solver finds a solution in 8024 nodes (1.8 seconds on a laptop)
• Again, CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution
144
Example: The Dutch TrainsScheduling intercity trains
Amsterdam,Rotterdam,Roosendaal,Vlissengen
Without coupling constraints, multi-commodity integer flow problem
With coupling constraints, a DLP with an integer relaxation
Additional logic handled directly in 2LP with CPLEXDisjunctive Programming and Cooperating Solvers,
CSTS 98 (Kluwer, edited by D. Woodruff)
Conclusions
• CP and MIP are powerful techniques that can solve many combinatorial problems
• Each has preferred formulations• Can get even greater benefits when combining
CP and IP algorithms
146
Recent and Current Work
Beaumont
Beringer, DeBacker
Balas, Ceria, Cornuejols.
Wallace, Rodosek, Schrimpf
Heipke, Colombani
Bockmayr
McAloon, Tretkoff, Wetzel
147
III. Exploiting Randomization to Solve Hard Combinatorial
Problems
148
Background
Combinatorial search methods often exhibita remarkable variability in performance. It is common to observe significant differences between:
- different heuristics- same heuristic on different instances- different runs of same heuristic with different seeds (stochastic methods)
149
Main Claim
One can take advantage of the extreme variability of combinatorial search methods:
One can One can improve the performance of a improve the performance of a deterministic complete methoddeterministic complete method, by , by introducing a introducing a stochastic elementstochastic element, while , while maintaining maintaining completeness.completeness.
We’ll explain We’ll explain WHYWHY that is the case. that is the case.
150
A Structured Benchmark Domain for Studying the
Distributions of Search Methods Stochasticity in Search ProceduresIntriguing Properties of Complete Backtrack Style AlgorithmsConsequences for Algorithm Design - Rapid Randomized RestartsPortfolio of Algorithms
151
Structured Benchmark Domain
152
Study of local and systematic search methods has been driven by:Random instance distributions (Hogg et al. 96). Limitation: lack of structure that characterizes realistic problems;
Highly structured problems (Fujita at al. 93). Limitation: “too much” structure.
We propose a benchmark domain that We propose a benchmark domain that bridges the gap between purely random bridges the gap between purely random instances and highly structured problems.instances and highly structured problems.
Background
Gomes and Selman 1997 - Proc. AAAI-97
153
Defn.: a pair (Q, *) where Q is a set, and * is a binary operation on Q such that a * x = b ; y * a = bare uniquely solvable for every pair of elements a,b in Q.
The multiplication table of its binary operation defines a latin square (i.e., each element of Q appears exactly oncein each row/column).
Example:Quasigroup of order 4
Quasigroups
154
Given a partial latin square, can it be completed?
Example:
Quasigroup Completion Problem (QCP)
155
Quasigroup Completion Problem A Framework for Studying Search
NP-Complete (Colbourn 1983, 1984; Anderson 1985).
Has a structure not found in random instances.
Leads to interesting search problems when structure is perturbed.
The study of this problem led us to identifythe unusual distributions of combinatorial search (Gomes, Selman & Crato --- CP97)
156
Aside: Applications of Quasigroups
Design of statistical experimentseliminating data dependencies
Scheduling/Timetabling (Anderson 1992)
completing a schedule given a set of pre-defined events
Automated theorem proving (Fujita et al. 1993)
existence vs. non-existence of quasigroups with intricate mathematical properties
157
Example: Scheduling of Drug Experiment
Given 5 different drugs, test the effects of the different medications on 5 different subjects over different days of the week.
Use constraint:No two people get same brand on the same day (eliminate bias for day of the week).
158
Quasigroup Completion SU
BJE
CT
DAYMon. Tues. Wed. Thurs. Fri.
Tim
Sue
Frank
Teresa
Todd
Tylenol Aleve Bayer ExhedrinExhedrin Advil
Aleve Bayer Exhedrin Advil Tylenol
Bayer Exhedrin AdvilAdvil Tylenol Aleve
Exhedrin AdvilAdvil Tylenol Aleve Bayer
Advil Tylenol Aleve Bayer Exhedrin
(*) Pre-assigned(*) Pre-assigned
159
QCP has a natural formulation as a ConstraintSatisfaction Problem
variable for each NxN entryconstraints capture row/column requirementvariable assignments capture pre-assigned values
160
How does the difficulty of QCP vary with the fraction of pre-assignment?
161
Fraction of pre-assignment
Med
ian
num
ber
of b
ackt
rack
s (lo
g)
Overconstrained areaUnderconstrained
area
Critically constrained area
162
Complexity Graph shows (up to order 20):
curve peaks around 42% of pre-assignment ---
critically constrained area.critically constrained area.
under-constrainedunder-constrained and over-over-constrainedconstrained areas are easier.
163
Directly related to the peak incomputational difficulty is the so-called phase transition graph forthe QCP problem.
164Fraction of pre-assignment
Frac
tion
of U
nsol
ved
case
s
Almost all unsolvable area
Almost all solvable area
Phase transition area
165
Phase Transition
QCP Phase Transition --- threshold phenomenonthreshold phenomenon from almost all solvable to almost all unsolvablefrom almost all solvable to almost all unsolvable --- occurs around 42% of preassignment.
It’s called a phase transition because of the closerelation to state transition phenomena studied inphysics, such as the melting of a solid into aliquid.
166
Exploiting Structure
167
Forward Checking Arc Consistency on binary constraints
Exploiting Structure in QCP
168
Arc Consistency on Binary Constraints
Further Exploiting Structure in QCP
Shaw, Stergiou and Walsh - ECAI98
General Arc Consistency on all different
constraints
169
Enforcing General Arc Consistency on All Different Constraints
• Beautiful example of integration of AI/OR techniques for a well defined sub-problem
• Propagation uses Maximum Matching problem (particular case of Network Flow problems which have polynomial time complexity)
Regin - AAAI94
170
Further Exploiting Structure in QCP
By enforcing general arc consistency on all different constraints problems up to order 50 could be solved!
Shaw, Stergiou and Walsh - ECAI98 Regin - AAAI94
171
Stochasticity in Search ProceduresStochasticity in Search Procedures
172
Background
Stochastic strategies have been very successful in the area of local search.
Limitation: inherent incomplete nature of local search methods.
We want to explore the addition of a We want to explore the addition of a stochastic element to a systematic search stochastic element to a systematic search procedure without losing completeness.procedure without losing completeness.
173
We introduce stochasticity in a backtrack search method by randomlybreaking ties in variable and/or valueselection.
Compare with standard lexicographictie-breaking.
174
Randomized Strategies
Strategy Variable sel. Value sel.
DD deterministic deterministic
DR deterministic random
RD random deterministic
RR random random
175
176
177
178
Lesson: Randomized tie-breaking can improve performance over a purely deterministic strategy.
Next: But we can obtain a more dramatic advantage from randomization ...
179
Cost Distributions
Key Properties:
I Erratic behavior of mean.I Erratic behavior of mean.
II Distributions have “II Distributions have “heavy tailsheavy tails”. ”.
180
Median = 1!
samplemean
number of runs
3500!
500
2000
181
1
182
75%<=30
Number backtracks Number backtracksProp
ortio
n of
cas
es S
olve
d
5%>100000
183
Heavy-Tailed Distributions
… … infinite variance … infinite meaninfinite variance … infinite mean
Introduced by Pareto in the 1920’s--- “probabilistic curiosity.”
Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena.
Examples: stock-market, earth-quakes, weather,...
184
Decay of Distributions
Standard --- Exponential Decay e.g. Normal:
Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy:
Pr[ ] , ,X x Ce x for someC x 2 0 1
Pr[ ] ,X x Cx x 0
185Standard Distribution
(finite mean & variance)
Power Law Decay
Exponential Decay
186
Normal, Cauchy, and Levy
Normal - Exponential Decay
Cauchy -Power law DecayLevy -Power law Decay
187
Tail Probabilities (Standard Normal, Cauchy, Levy)
c Normal Cauchy Levy0 0.5 0.5 11 0.1587 0.25 0.68272 0.0228 0.1476 0.52053 0.001347 0.1024 0.43634 0.00003167 0.078 0.3829
188
How to Check for “Heavy Tails”?
Log-Log plot of tail of distributionshould be approximately linear.
Slope gives value of infinite mean and infinite varianceinfinite mean and infinite variance
infinite varianceinfinite variance
1
1 2
189
Example of Heavy Tailed Model(Random Walk)
Random Walk:•Start at position 0•Toss a fair coin:
• with each head take a step up (+1)• with each tail take a step down (-1)
X --- number of steps the random walk takes to return to position 0.
190
The record of 10,000 tosses of an ideal coin(Feller)
Zero crossing Long periods without zero crossing
191
Random Walk
Heavy-tails vs. Non-Heavy-Tails
Normal(2,1000000)
Normal(2,1)
O,1%>200000
50%
2
Median=2
1-F(
x)U
nsol
ved
frac
tion
X - number of steps the walk takes to return to zero (log scale)
192
466.0
319.0
153.0
Number backtracks (log)
1-F(
x)U
nsol
ved
frac
tion
1 => Infinite mean
Heavy-tails in QCP Domain
193
The Log-Log plot shows a linear relationover many orders of magnitude. This isclear evidence of heavy-tailed behavior.
194
195
196
Heavy Tailed Cost Distribution
0.1
1
1 10 100 1000 10000 100000
log( Backtracks )
log(
1 -
F(x)
)
197
The Log-Log plot shows a linear relationover many orders of magnitude. This isclear evidence of heavy-tailed behavior.
198
By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution.
Right-hand side: non-negligible fraction of very long runsLeft-hand side: non-negligible fraction of very short runs
199
70%>250000
15!
1%<=650!
Sports Scheduling
Number backtracks (log)
Cum
ulat
ive
Dis
trib
utio
n Fu
nctio
n
200Standard Distribution
(finite mean & variance)
Power Law Decay
Exponential Decay
Also, heavy tails on left. (High probability of very short runs.)
201
Consequence for algorithm design:
Use rapid restarts or parallel / inter-leaved runs
Super linear speedups!!!
202
X XX XX
solved10 101010 10
Sequential: 50 +1 = 51 secondsParallel: 10 machines --- 1 second 51 x speedup
Super-linear Speedups
Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup
203
Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution.
RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left.
Restarts provably eliminate heavy tails (Gomes, Selman & Crato )
204
Sketch of proof of elimination of heavy tails
Let’s truncate the search procedure after m backtracks.
Probability of solving problem with truncated version:
Run the truncated procedure and restart it repeatedly.
pm X m Pr[ ]
X numberof backtracks to solve the problem
205
Y total number backtracks with restarts
F Y y pmY m c e c y
Pr[ ] ( ) /1 1
2
Number of starts Y m Geometric pmRe / ~ ( )
Y - does not have Heavy Tails
206
Restarts
70%unsolved
250~ 62.5 restarts
1-F(
x)U
nsol
ved
frac
tion
Number backtracks (log)
207
Example of Rapid Restart Speedup(planning)
1000
10000
100000
1000000
1 10 100 1000 10000 100000 1000000
log( cutoff )
log
( bac
ktra
cks
)
20
2000 ~100 restarts
Cutoff (log)
Num
ber
back
trac
ks (l
og)
~10 restarts100000
208
Deterministic
Logistics Planning 108 mins. 95 sec.Scheduling 14 411 sec 250 sec
(*) not found after 2 days
Scheduling 16 ---(*) 1.4 hoursScheduling 18 ---(*) ~18 hrsCircuit Synthesis 1 ---(*) 165sec.Circuit Synthesis 2 ---(*) 17min.
Summary Results
R3
209
Our results provide the first indication of heavy-tailed distri-butions in a computational model.
Overall insight:Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy.
210
Heavy-Tailed Distributionsin Other Domains
Quasigroup Completion Problem
Graph Coloring
Logistic Planning
Circuit Synthesis
Gomes, Selman, and Crato 1997; Gomes, Selman, McAloon, and Tretkoff 1998; Gomes,Kautz, and Selman 1998;
211
Deterministic
Logistics Planning 108 mins. 95 sec.Scheduling 14 411 sec 250 sec
(*) not found after 2 days
Scheduling 16 ---(*) 1.4 hoursScheduling 18 ---(*) ~18 hrsCircuit Synthesis 1 ---(*) 165sec.Circuit Synthesis 2 ---(*) 17min.
Summary Results
R3
212
Rapid Restart Speedup
1000
10000
100000
1000000
1 10 100 1000 10000 100000 1000000
log( cutoff )
log
( bac
ktra
cks
)
213
Our results provide the first indication of heavy-tailed distri-butions in a computational model.
Overall insight:Overall insight:Randomized tie-breaking withrapid restarts gives powerfulsearch strategy.
214
Heavy-Tailed Distributionsin Other Domains
Quasigroup Completion Problem
Graph Coloring
Logistic Planning
Circuit Synthesis
Gomes, Selman, and Crato 1997 - Proc. CP97; Gomes, Selman, McAloon, and Tretkoff 1998 - Proc AIPS98; Gomes,Kautz, and Selman 1998 - Proc. AAAI98.
215
Algorithm Portfolio Design
Gomes and Selman 1997 - Proc. UAI-97; Gomes, Selman, and Crato 1997 - Proc. CP97.
216
Motivation
The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances.
Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths.
217
Branch & Bound:Best Bound vs. Depth First Search
218
Branch & Bound(Randomized)
Standard OR approach for solving Mixed Integer Programs (MIPs)• Solve linear relaxation of MIP• Branch on the integer variables for which the solution of the LP relaxation is non-integer:
apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value)
• Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values
219
Branch & BoundDepth First vs. Best bound
Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound
(standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic
Depth-first - often quickly reaches an integer solution(may take longer to produce an overall optimal value)
220
Portfolio of Algorithms
A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors.Goal: to improve on the performance of the component algorithms in terms of:
expected computational cost“risk” (variance)
Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk.
221
Depth-first vs. Best-bound(logistics planning)
Number of nodes
Cum
ulat
ive
Freq
uenc
ies
Depth-First~50%
Best-Bound
~30%
222
Depth-First and Best and Bound do not dominate each other overall.
223
Heavy-tailed behavior of Depth-first
224
Portfolio for heavy-tailed search procedures (2 processors)
0 DF / 2 BB
2 DF / 0 BB
Standard deviation of run time of portfolios
Expe
cted
run
tim
e of
por
tfol
ios
225
Portfolio for heavy-tailed search procedures (6 processors)0 DF / 6 BB
6 DF / 0BB
Standard deviation of run time of portfoliosExpe
cted
run
tim
e of
por
tfol
ios
5 DF / 1BB
3 DF / 3 BB4 DF / 2 BB
Efficient set
226
Portfolio for heavy-tailed search procedures (20 processors)
0 DF / 20 BB
20 DF / 0 BB
Standard deviation of run time of portfolios
Expe
cted
run
tim
e of
por
tfol
ios
227
Portfolio for heavy-tailed search procedures (2-20 processors)
228
A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena.
229
Summary of Randomization Considered randomized backtrack search.
Showed Heavy-Tailed Distributions.
Suggests: Rapid Restart Strategy. --- cuts very long runs --- exploits ultra-short runs Experimentally validated on previously unsolved planning and
scheduling problems.
Portfolio of Algorithms for cases where no single heuristic dominates
230
Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions.
Suggests: Rapid Restart Strategy. --- exploits ultra-short runs --- cuts very long runs
Experimentally validated on previously unsolved planning and scheduling problems.
Portfolio of Algorithms for cases where no single heuristic dominates
231
IV. CONCLUSIONS
232
Important Themes in OR
Linear Programming (Mixed) Integer Programming
Exploit Structure e.g., Network Flow Problems
Duality very elegant theory in LP sensitivity analysis
233
Opportunities for Integration of AI/OR
OR methods:Have focused on tractable representations (LP)Have demonstrated the ability to identify optimal and locally optimal solutionsLIMITATION: Restricted to rigid models with limited expressive powerAI methods:Richer and more flexible representations,supporting constraint-based reasoning mechanisms as well as mixed initiative frameworks, allowing the human expertise to be in the loop.LIMITATION: Rich representations in general lead to intractable problemsCHALLENGE: good representations / fast & good solutions
,
234
Opportunities for Integration of AI/OR
AI methods are becoming competitive
AI methods used to be considered not suitable for realworld scheduling problems. Recent developments have shown they can be competitive. Examples:
SAP, Peoplesoft, I2, … -> provide solutions for scheduling combining constraint programming and mathematical programming approaches.ILOG (CP language) has several fielded applications in different scheduling areas; ILOG has integrated a CSP solver with CPLEX.OR people have acknowledge the benefits of combining OR and AI methods
235
Opportunities for Integration of AI/OR
Exploiting Duality in CSP frameworks
Exploiting Randomization
Hybrid Solvers
236
Opportunities for Integration of AI/OR
Hybrid Solvers - emerging area of research (CSP+OR); it started with CLP(R), Prolog III and CHIP; ILOG integrates a CSP solver with CPLEX
local constraint propagation - local consistency algs global constraint propagation - LP relaxations
Only a hybrid approach could prove optimality, e.g.:Hoist scheduling (Rodosek & Wallace 1998)Multicommodity integer network flow problem (Dutch Railways) (McAloon, Tretkoff, Wetzel 1998)
237
Updated version of tutorial slides
www.cs.cornell.edu/gomes/
TalksDemos
238
Appendix
239
Portfolio of Algorithms
A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors.
A portfolio has an expected computational cost and a standard deviation, a measure of the dispersion of the computational cost.
The standard deviation of the portfolio is a measure of the risk inherent to the portfolio.
240
Portfolio of Algorithms
Goal: to improve on the performance of the component algorithms in terms of:
expected computational cost“risk” (variance)
Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk.
241
AppendixPortfolio of Algorithms
Goal: to improve on the performance of the component algorithms in terms of:
expected computational cost;risk;
Efficient Set or Efficient Frontier - set of portfolios that are the best in terms of expected value and risk.
Within the efficient set, in order to minimize the risk, one has to deteriorate the expected value or, in order to improve the expected value, one has to increase the risk.
242
Appendix Portfolio of Two Algorithms
Let us consider the random variables:
A1 - the number of backtracks that algorithm 1 takes to find a solution or prove that a solution doesn’t exist;
A2 - the number of backtracks that algorithm 2 takes to find a solution or prove that a solution doesn’t exist;
243
Appendix Portfolio of Two Algorithms
Let us consider that we have N processors and we design a portfolio using n1 processors with algorithm 1 and n2 processors with algorithm2 (N = n1 + n2).
Let us consider the random variable:
X - the number of backtracks that the portfolio takes to find a solution or prove that a solution doesn’t exist;
244
AppendixPortfolio of Two Algorithms
Given N processors, and
P[X x]Nii 1
NP[A1 x]i P[A1 x](N i)
n N1 n2 0
245
Appendix Portfolio of Algorithms
Given N processors, such that and n1n N n2 1 ,
0 1 n N
P[X x]n1
i'i' 0
n1P[A1 x]i
'P[A1 x](n1 i' )
i
N
1
n2
i' 'P[A2 x]i
' 'P[A2 x](n2 i' ' )
i i i' ' ' and the term in the summation is 0 when 2'',0'' nii
246
Preliminary Research on Structure of Search Spaces
247
Fringe of Search Tree
248
Fractal Dimension
249
Fractal Dimension When plotting the length of a curve as a function of the measuring tool on a log-log plot, one obtains a
linear relationship:
L - the measured length; s - length of the yardstick;
c and d are constants;
Mandelbrot introduced the fractal dimension D = d +1; A straight line has D = 1.0; The coast of Britain has fractal dimension 1.22;
The higher D the more fractal the curve is.
dscL )/1(
250
Heavy-Tailed Behavior vs Non-heavy-tailed behavior