
Page 1: Integration of Artificial Intelligence and Operations Research for

1

Integration of Artificial Intelligence and Operations Research Techniques

for Combinatorial Problems

Carla P. Gomes
Cornell University
[email protected]

Ken McAloon and Carol Tretkoff
ILOG
{mcaloon,tretkoff}@ilog.com

Page 2: Integration of Artificial Intelligence and Operations Research for

2

AI, OR, and CS

AI OR

CS

Page 3: Integration of Artificial Intelligence and Operations Research for

3

Integration of Artificial Intelligence & Operations Research Techniques

AI
• Representations: Constraint Languages, Logic Formalisms, Object-Oriented Prog., Bayesian Nets, Rule Based Systems, ...
• Tools: Constraint Propagation, Systematic Search, Stochastic Search, ...
• Pros / Cons: Rich Representations, Computational Complexity

OR
• Representations: Mathematical Modeling Languages, Linear & Non-linear (In)Equalities, ...
• Tools: Linear Programming, Mixed-Integer Prog., Non-linear Models, ...
• Pros / Cons: More Tractable (LP), Primarily Complete Info, Limited Representations

Combinatorial Problems: Planning, Scheduling

THE CHALLENGE (AI + OR): UNIFY APPROACHES TO
• SCALE UP SOLUTIONS
• HANDLE UNCERTAINTY
• ANALYZE COMPLEXITY (phase transition)
• EXPLOIT PROBLEM STRUCTURE
• INCREASE ROBUSTNESS

[Figure: screenshot of the ROME LABORATORY OUTAGE MANAGER (ROMAN), with menus Parameters / Load / Run / Gantt Charts / Utilities and a Gantt chart of AC-POWER status for DIV1 to DIV4 over a 0 to 90 time axis from Start to Goal; figure label: FRAGILE]

Page 4: Integration of Artificial Intelligence and Operations Research for

4

Outline

I. Short Overview of OR
II. Disjunctive Programming and Hybrid Solvers
III. Exploiting Randomization to Solve Hard Combinatorial Problems
IV. Conclusions

Page 5: Integration of Artificial Intelligence and Operations Research for

5

I. Short OR Overview

Page 6: Integration of Artificial Intelligence and Operations Research for

6

Outline for Linear Programming and Integer Programming

• Standard Form of LP and a Simple Example
• Geometric Interpretation of LP
• Complexity issues
• MIP
• Example: Fast Food
• Example: Capacitated Warehouse
• Example: 911

Page 7: Integration of Artificial Intelligence and Operations Research for

7

Outline

1. Short Overview of OR
2. Constraint Programming
3. Cooperating Solvers
4. Disjunctive Programming
5. Exploiting Randomization to Solve Hard Combinatorial Problems
6. Conclusions

Page 8: Integration of Artificial Intelligence and Operations Research for

8

Optimization Technology Evolution

[Figure: timeline of optimization technology with axis marks 1947, 1960, 1970, 1980, 1990, 1998. Milestones shown: Primal Simplex LP, Dual Simplex, Dual Simplex Implementation, Interior Point, Barrier LP, Barrier Crossover, MIP, Large IPs, Parallel LP/MIP, CPM/PERT, Dispatch Rules, Shifting Bottleneck, SA/GA/Tabu, Constraint Propagation, First CP Systems, Global constraints, Constraint-based Scheduling, Concurrent Scheduling, Cooperating Solvers (LP/CP)]

Page 9: Integration of Artificial Intelligence and Operations Research for

9

1. Short OR Overview


Page 11: Integration of Artificial Intelligence and Operations Research for

11

An LP Story

A factory can produce n products from m parts.
For product j it needs $a_{ij}$ units of part i.
There are $b_i$ units of part i available.
Each unit of product j sold earns $c_j$.

The amount of each product to make is unknown: $x_j \ge 0$
Each part i determines a constraint: $a_{i1} x_1 + \dots + a_{in} x_n \le b_i$

Obvious solution: do nothing

Better: maximize $c_1 x_1 + \dots + c_n x_n$

Page 12: Integration of Artificial Intelligence and Operations Research for

12

Standard Forms of LP

A linear program (LP) in standard form (Dantzig 1947)

max $c^T x$ subject to $Ax \le b$, $x \ge 0$

Input data: c (n x 1), A (m x n), b (m x 1).
Variables: x (n x 1)

Page 13: Integration of Artificial Intelligence and Operations Research for

13

Standard Forms of LP

// The objective function
max $c_1 x_1 + \dots + c_n x_n$

// The constraints
subject to

$a_{11} x_1 + \dots + a_{1n} x_n \le b_1$
...
$a_{m1} x_1 + \dots + a_{mn} x_n \le b_m$

$x_1 \ge 0, \dots, x_n \ge 0$

Page 14: Integration of Artificial Intelligence and Operations Research for

14

Standard Forms of LP

• In OR emphasis is on optimality

• Solution means optimal solution

• Feasible solution means solution in the ordinary sense

Page 15: Integration of Artificial Intelligence and Operations Research for

15

Standard Forms of LP

Interpretation of standard form:
• $x_j$ = amount of product j to make
• $c_j$ = revenue per unit of product j
• $b_i$ = available amount of component i
• $a_{ij}$ = units of i used per unit of j produced

The constraints “say”:
$a_{ij} x_j$ = units of i used by j
$\sum_j a_{ij} x_j$ = units of i used $\le b_i$

Page 16: Integration of Artificial Intelligence and Operations Research for

16

What are models?

A model is a data-independent abstraction of a problem.
A model lets you write down the mathematical representation of a problem independently of the data.

[Diagram: a Project combines a Model with Data to give one problem instance]

Page 17: Integration of Artificial Intelligence and Operations Research for

17

Products Could be Jewelry

Products: Rings and Earrings
Components: Gold and Diamonds

One ring requires 3 units of Gold and 1 Diamond
One set of earrings requires 2 units of Gold and 2 Diamonds

Total Gold and Diamonds are limited

Profit is different for Rings than for Earrings

Products = { rings, earrings };
Components = { Gold, Diamonds };

demand = [ [3, 1], [2, 2] ];

stock = [150, 180];

profit = [60, 40];

Page 18: Integration of Artificial Intelligence and Operations Research for

18

Products Could be Chemicals

Products: Ammonia Gas = NH3, Ammonium Chloride = NH4Cl

Components: Nitrogen, Hydrogen, Chlorine

One unit of Gas requires 1 unit of Nitrogen and 3 units of Hydrogen
One unit of Chloride requires 1 unit of Nitrogen, 4 units of Hydrogen, and 1 unit of Chlorine

Total Nitrogen, Hydrogen, and Chlorine are limited

Profit is different for Gas than for Chloride

Products = { gas, chloride };

Components = { nitrogen, hydrogen, chlorine };

demand = [ [1, 3, 0], [1, 4, 1] ];

stock = [50, 180, 40];

profit = [30, 40];

Page 19: Integration of Artificial Intelligence and Operations Research for

19

The Problems Have One Model

// Data
enum Products ...;
enum Components ...;

float+ demand[Products, Components] = ...;
float+ profit[Products] = ...;
float+ stock[Components] = ...;

// Decision Variables
var float+ production[Products];

// Objective Function
maximize
   sum (p in Products) profit[p] * production[p]

// Constraints
subject to {
   forall (c in Components)
      sum (p in Products) demand[p, c] * production[p] <= stock[c]
};
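The point of the slide, one model and many data sets, can be tried out with a minimal Python sketch. It is not OPL and not a general LP solver: it assumes exactly two products and finds the optimum by enumerating the vertices of the feasible region. The function name `solve_product_mix` is made up for this illustration; the data are the jewelry and chemicals instances from the preceding slides.

```python
from fractions import Fraction
from itertools import combinations

def solve_product_mix(demand, stock, profit):
    """Maximise profit for exactly two products by vertex enumeration.

    demand[p][c] = units of component c needed per unit of product p.
    Returns (best_profit, (x0, x1)).
    """
    # Every constraint has the form a*x0 + b*x1 <= r.
    cons = [(demand[0][c], demand[1][c], stock[c]) for c in range(len(stock))]
    cons += [(-1, 0, 0), (0, -1, 0)]          # x0 >= 0 and x1 >= 0
    best = None
    for (a1, b1, r1), (a2, b2, r2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue                          # parallel lines, no vertex
        # Cramer's rule for the intersection of the two boundary lines.
        x0 = Fraction(r1 * b2 - r2 * b1, det)
        x1 = Fraction(a1 * r2 - a2 * r1, det)
        if all(a * x0 + b * x1 <= r for a, b, r in cons):
            value = profit[0] * x0 + profit[1] * x1
            if best is None or value > best[0]:
                best = (value, (x0, x1))
    return best

# Same model, two data sets.
jewelry = solve_product_mix([[3, 1], [2, 2]], [150, 180], [60, 40])
chemicals = solve_product_mix([[1, 3, 0], [1, 4, 1]], [50, 180, 40], [30, 40])
print(jewelry[0], chemicals[0])
```

In both instances the objective happens to be parallel to a binding constraint, so the optimal value is attained along a whole edge; the sketch reports one optimal vertex.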

Page 20: Integration of Artificial Intelligence and Operations Research for

20

OR Modeling Systems

• OPL• AMPL• 2LP• AIMMS• GAMS• MPL• ILOG Planner• etc

Page 21: Integration of Artificial Intelligence and Operations Research for

21

The Dual

The dual linear program (von Neumann 1947):

min $y^T b$ subject to $y^T A \ge c^T$, $y \ge 0$

Variables: y (m x 1)

Awesome Symmetry - The dual of the dual is the primal

Page 22: Integration of Artificial Intelligence and Operations Research for

22

Rows and Columns Exchanged

min $b_1 y_1 + \dots + b_m y_m$

subject to

$a_{11} y_1 + \dots + a_{m1} y_m \ge c_1$
...
$a_{1n} y_1 + \dots + a_{mn} y_m \ge c_n$

$y_1 \ge 0, \dots, y_m \ge 0$

Page 23: Integration of Artificial Intelligence and Operations Research for

23

Duality Theorem

Theorem: $\min y^T b = \max c^T x$

• Consequence: This turns the optimality problem into a feasibility problem in x and y:
$Ax \le b$, $x \ge 0$
$y^T A \ge c^T$, $y \ge 0$
$y^T b = c^T x$

• Consequence: Enumeration not needed to verify optimality
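As an illustration of this consequence, the sketch below checks a candidate primal/dual pair against the feasibility system instead of searching. The function name `is_optimal_pair` is an assumption of this example; the data are the jewelry instance, with rows of A indexed by component (gold, diamonds) and columns by product (rings, earrings).

```python
def is_optimal_pair(A, b, c, x, y):
    """Return True if x and y certify each other's optimality.

    Checks Ax <= b, x >= 0, y^T A >= c^T, y >= 0 and y^T b == c^T x.
    """
    m, n = len(A), len(A[0])
    primal_ok = all(v >= 0 for v in x) and all(
        sum(A[i][j] * x[j] for j in range(n)) <= b[i] for i in range(m))
    dual_ok = all(v >= 0 for v in y) and all(
        sum(y[i] * A[i][j] for i in range(m)) >= c[j] for j in range(n))
    same_value = (sum(c[j] * x[j] for j in range(n))
                  == sum(y[i] * b[i] for i in range(m)))
    return primal_ok and dual_ok and same_value

A = [[3, 2], [1, 2]]   # gold and diamonds used per ring / per set of earrings
b = [150, 180]         # stock
c = [60, 40]           # profit

print(is_optimal_pair(A, b, c, x=[50, 0], y=[20, 0]))  # feasible pair, equal values
print(is_optimal_pair(A, b, c, x=[40, 0], y=[20, 0]))  # feasible, but values differ
```

The first pair has primal value 60 * 50 = 3000 and dual value 20 * 150 = 3000, so both are optimal without enumerating any other solution.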

Page 24: Integration of Artificial Intelligence and Operations Research for

24

Duality Theorem

• Sensitivity Analysis
• Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints, which measure the rate of change of the objective function with respect to the right hand side bounds b

$y_i^* = \partial Z / \partial b_i$ where Z is the optimum

Reference: McAloon and Tretkoff [1996] Wiley

Page 25: Integration of Artificial Intelligence and Operations Research for

Duality

Two different views of the same phenomenon

Point vs Set

Arc vs Node

Momentum vs Position

Vector vs Hyperplane

Landlord vs Renter

Page 26: Integration of Artificial Intelligence and Operations Research for

26

Simplex and Barrier

• The simplex algorithm turns the feasibility problem into an iterative repair process with a powerful evaluation function

• The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope

Page 27: Integration of Artificial Intelligence and Operations Research for

27

Geometric Interpretation of LP

X

Y

Max: X
subject to:
-X + Y <= 4
X + 4*Y <= 36
2*X + Y <= 23
X + Y >= 4
Y >= X + 10

(0,4)

(4,0) (8,0)

(10,3)

(4,8)

Barrier

Simplex

Page 28: Integration of Artificial Intelligence and Operations Research for

28

Complexity of Linear Programming

Simplex Method
• Worst-case --- exponential (Klee and Minty 72)
• Practice --- good performance

Ellipsoid Method (Khachiyan’s Ellipsoid Method)
• Worst-case --- polynomial
• Practice --- poor performance

Page 29: Integration of Artificial Intelligence and Operations Research for

29

Complexity of Linear Programming

Interior Point Methods or Barrier Methods (“Karmarkar’s” Method and variants)
• Worst-case --- polynomial
• Practice --- good performance

Page 30: Integration of Artificial Intelligence and Operations Research for

30

Complexity of Linear Programming

• Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice.

• Which method performs best is problem dependent.

Page 31: Integration of Artificial Intelligence and Operations Research for

31

Success Stories

• Industrial Planning
Given current resources, decide what to produce in what quantity

• Supply Chain Management
Multiperiod planning models that link flow from one period to the next

• Network Flow
How best to route goods across a network

Page 32: Integration of Artificial Intelligence and Operations Research for

32

Assumptions of Linear Programming

• Linearity
when violated: (xy = 50)
Nonlinear programming

• Continuity
when violated: (x integral)
(Mixed) Integer programming

Page 33: Integration of Artificial Intelligence and Operations Research for

33

Assumptions of Linear Programming - continued

• No Disjunctive Constraints
when violated: ($x \ge 100$ or $x \le 0$)
Disjunctive programming
Additional 0-1 variables and Big M constraints

• Certainty
when violated: (cost c is a random variable)
Stochastic programming

Page 34: Integration of Artificial Intelligence and Operations Research for

34

Search and MIP

• In order to deal with variables that must have integer values in the solution, a search must be performed.

• Mixed Integer Programming problems are combinatorial optimization problems and are NP hard

• feasibility is NP-Complete
• verifying optimality is co-NP-Complete

Page 35: Integration of Artificial Intelligence and Operations Research for

35

MIP and Combinatorial Optimization

• These problems have been attacked by both the AI and OR communities.

• In AI, these problems are attacked as CSPs or as Planning Problems.

• In OR, they are done as MIPs and use linear relaxation to help guide the search.

• The overriding idea in each case is to limit search.

Page 36: Integration of Artificial Intelligence and Operations Research for

36

Integer Program: All Integer Points in Region

Page 37: Integration of Artificial Intelligence and Operations Research for

37

Cut to Create Integer Vertex

Integer Vertex

Page 38: Integration of Artificial Intelligence and Operations Research for

38

Example - Fast Food

• Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet?

• If so, what is the least he can do it for?

Page 39: Integration of Artificial Intelligence and Operations Research for

39

Nutritional Requirements

• At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron

• At least 55 grams of protein• At most 3000 milligrams of sodium• At most 30% of the calories can come from fat

• Nutritional information is available from fast food outlets

Page 40: Integration of Artificial Intelligence and Operations Research for

40

College Student’s Requirements

• At least 2000 calories a day• No more than 3 servings of any one food• Milk only with cereal and not as a stand-alone

drink

Page 41: Integration of Artificial Intelligence and Operations Research for

41

Fast Food - MIP Model

• We will have variables Servk to represent the number of servings of item k in the plan.

• The variable Servk will have to take an integer value for the solution to be valid.

• The objective function: Z for cost

Page 42: Integration of Artificial Intelligence and Operations Research for

42

Fast Food - MIP Model

• Let foodk,j represent the percent of RDA of nutrient j in a serving of item k

• Then for each nutrient j, we have a constraint

$\sum_k \mathit{food}_{k,j} \, \mathit{Serv}_k \ge 100$

Page 43: Integration of Artificial Intelligence and Operations Research for

43

Fast Food - MIP Model

• Let sodiumk represent the amount of salt in a serving of item k

• For salt we have the constraint $\sum_k \mathit{sodium}_k \, \mathit{Serv}_k \le 3000$

• Similarly for fat

Page 44: Integration of Artificial Intelligence and Operations Research for

44

Fast Food - MIP Model

• Let costk represent the cost of a serving of item k

• For the objective function we have the defining constraint

$\sum_k \mathit{cost}_k \, \mathit{Serv}_k = Z$

Page 45: Integration of Artificial Intelligence and Operations Research for

45

Fast Food - Solution

• With a MIP solver and a way to input these constraints we ask for

• a solution that makes the variables Servk integral

• and which minimizes Z

Page 46: Integration of Artificial Intelligence and Operations Research for

46

MIP Solution Technique

• What the MIP solver does is to carry out a branch and bound search guided by

• the linear relaxation
– the solution to the problem with the integrality requirements relaxed

• Initialize the global variable best_so_far to 1000 (or something else very big).

Page 47: Integration of Artificial Intelligence and Operations Research for

47

At a Node

• Compute a solution to the linear relaxation which minimizes Z, yielding z*. Prune this node if

$z^* \ge$ best_so_far

• If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node.

Page 48: Integration of Artificial Intelligence and Operations Research for

48

Branching at a node

• Choose a variable Servk whose value s* is not integral.

• Typical heuristic: most non-integral variable

• Create two child nodes:
• add $\mathit{Serv}_k \le \lfloor s^* \rfloor$
• add $\mathit{Serv}_k \ge \lceil s^* \rceil$
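The node processing and branching rules above can be sketched in Python as follows. This is only an illustration under strong assumptions: it handles two integer variables, solves each linear relaxation by brute-force vertex enumeration rather than simplex, and uses a small made-up diet instance (two foods, costs 5 and 4, two nutrient requirements, at most 3 servings each). All names and numbers are hypothetical and are not the fast food data of the talk.

```python
import math
from fractions import Fraction
from itertools import combinations

def solve_relaxation(cost, cons):
    """Minimise cost[0]*x + cost[1]*y subject to a*x + b*y <= r for each
    (a, b, r) in cons. Returns (z, (x, y)), or None if infeasible."""
    best = None
    for (a1, b1, r1), (a2, b2, r2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue
        x = Fraction(r1 * b2 - r2 * b1, det)
        y = Fraction(a1 * r2 - a2 * r1, det)
        if all(a * x + b * y <= r for a, b, r in cons):
            z = cost[0] * x + cost[1] * y
            if best is None or z < best[0]:
                best = (z, (x, y))
    return best

def branch_and_bound(cost, cons):
    best_so_far = None                       # plays the role of "very big"
    open_nodes = [list(cons)]
    while open_nodes:
        node = open_nodes.pop()
        relaxed = solve_relaxation(cost, node)
        if relaxed is None:
            continue                         # infeasible node
        z, point = relaxed
        if best_so_far is not None and z >= best_so_far[0]:
            continue                         # prune: z* >= best_so_far
        fractional = [(k, s) for k, s in enumerate(point) if s.denominator != 1]
        if not fractional:
            best_so_far = (z, point)         # all values integral: save node
            continue
        # Typical heuristic: branch on the most non-integral variable.
        k, s = max(fractional, key=lambda ks: abs(ks[1] - round(ks[1])))
        a, b = (1, 0) if k == 0 else (0, 1)
        open_nodes.append(node + [(a, b, math.floor(s))])      # Serv_k <= floor(s*)
        open_nodes.append(node + [(-a, -b, -math.ceil(s))])    # Serv_k >= ceil(s*)
    return best_so_far

# Hypothetical instance: 40x + 25y >= 100, 10x + 30y >= 55, 0 <= x, y <= 3.
constraints = [(-40, -25, -100), (-10, -30, -55),
               (1, 0, 3), (0, 1, 3), (-1, 0, 0), (0, -1, 0)]
result = branch_and_bound((5, 4), constraints)
print(result)
```

A production MIP solver replaces the vertex enumeration with a warm-started simplex and adds a node selection rule; the pruning and branching steps are otherwise the ones listed on the slides.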

Page 49: Integration of Artificial Intelligence and Operations Research for

49

Good News

• The linear relaxation can prune nodes before all variables Servk are forced to be integral.

• Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why

• A solution to the LP is at a vertex
• A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints

• Many of these constraints are integer bounding constraints yielding X = integer

Page 50: Integration of Artificial Intelligence and Operations Research for

50

Arboreally Speaking

• Best-first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A*

• If the linear relaxation is tight, that is, if

$| z^*_{linear} - z^*_{integral} |$ is relatively small

then $z^*_{linear}$ is an excellent evaluation function

Page 51: Integration of Artificial Intelligence and Operations Research for

51

Answer - Fast Food

Total cost is 8.71

Buy 3 burgers
Buy 2 fries
Buy 3 honeys
Buy 1 yogurt
...

Page 52: Integration of Artificial Intelligence and Operations Research for

52

Example - Fixed Cost

• Warehouses must be rented in order to supply stores and we must decide which to use

• For each store j we know its monthly demand dj

• For each warehouse i we know its capacity ki

• For each warehouse i we know the fixed cost to run it each month fci

• For each pair i, j we know the monthly cost cij of supplying j from i

Page 53: Integration of Artificial Intelligence and Operations Research for

53

Example - Fixed Cost

• $X_{ij}$ is the fraction of store j’s demand met by i
• $X_{ij} \le 1$

• $Y_i$ is a “fuzzy” boolean
• it will be 1 if the warehouse is rented
• 0 if it is not rented
• $Y_i \le 1$

Page 54: Integration of Artificial Intelligence and Operations Research for

54

Example - Fixed Cost

• Each store must be supplied: $\sum_i X_{ij} = 1$

• Warehouse capacity cannot be exceeded: $\sum_j d_j X_{ij} \le k_i$

• Tighter: $\sum_j d_j X_{ij} \le k_i Y_i$

Page 55: Integration of Artificial Intelligence and Operations Research for

55

Example - Fixed Cost

• Objective function

$\sum_i fc_i Y_i + \sum_{i,j} c_{ij} X_{ij}$

• This yields a MIP with 0-1 variables Yi

Page 56: Integration of Artificial Intelligence and Operations Research for

56

Branch and Cut: An Enhanced Solution Method

• Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation

$X_{ij} \le Y_i$

• Add at a node if violated by solution to linear relaxation

• Powerful method - will solve the Imperial College OR lib CW problems very easily

Page 57: Integration of Artificial Intelligence and Operations Research for

57

Example - Call 911

• PCTs answer the phone 24 hours a day, 7 days a week.

• It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate.

• Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours.

Page 58: Integration of Artificial Intelligence and Operations Research for

58

Example - Call 911

• Each PCT has a work week of 5 days followed by 2 days off.

• Want to meet the demand with minimal or near-minimal number of PCTs.

• So need to determine how many PCTs start their work week at each hour h of the week

Page 59: Integration of Artificial Intelligence and Operations Research for

59

Modeling 911

• A continuous variable $Pct_h$ will represent the number of workers who start their work week at hour h, $0 \le h < 168$.

Page 60: Integration of Artificial Intelligence and Operations Research for

60

Modeling 911

• A continuous variable Z will represent the objective function

$\sum_h Pct_h = Z$

• There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed.

Page 61: Integration of Artificial Intelligence and Operations Research for

61

Modeling 911

• For this constraint we need to represent the number of workers who are on duty at time h

• Certainly, those who start the week at time h are here, as are those who started the week at time h - 1

• And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break.

Page 62: Integration of Artificial Intelligence and Operations Research for

62

Modeling 911

• This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h

$\sum_w Pct_w \ge b_h$

Page 63: Integration of Artificial Intelligence and Operations Research for

63

Call 911 solved with progressive roundoff

int b[168] = { // New York City 911
30,24,18,15,14,14,15,25,34,36,38,40,41,43,46,57,57,59,61,59,55,50,45,38,
32,25,20,17,15,13,17,25,32,35,38,40,42,43,47,58,57,57,59,57,55,52,47,41,
33,25,20,17,15,13,15,25,32,33,37,39,42,43,47,57,56,57,57,56,53,50,47,41,
34,27,22,19,16,15,16,25,31,35,37,40,44,45,48,57,57,56,58,56,53,53,46,41,
34,28,23,19,16,15,17,25,33,37,39,42,45,47,51,59,58,60,61,61,57,56,57,55,
48,41,35,30,26,20,18,22,26,32,42,46,49,53,54,56,56,56,59,59,57,57,56,56,
52,46,41,34,29,23,18,19,25,31,36,41,46,50,52,53,52,53,54,53,50,49,45,40
};

Page 64: Integration of Artificial Intelligence and Operations Research for

64

Modeling 911

• Subject to these constraints we want to find a solution which makes the Pcth integer and which makes Z small.

• The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer.

• The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers.

Page 65: Integration of Artificial Intelligence and Operations Research for

65

Modeling 911

• For this and many other applications, heuristics can be used to develop good solutions

• Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc.

Page 66: Integration of Artificial Intelligence and Operations Research for

66

Solving the Integer Problem

main() // Planner Code
{
  IlcInitFloat();
  IlcManager m(IlcNoEdit);
  IlcLinOpt simplex(m);
  IlcFloatVarArray Pct(m,168,0,1000);
  IlcFloatArray coeffs(m,168);
  int i,j,k,h,n;

Page 67: Integration of Artificial Intelligence and Operations Research for

67

Solving the Integer Problem

// sum over workers w on duty at hour h of Pct[w] >= b[h]

for(h=0;h<168;h++) {               // for each hour of 168 in week
  for(j=0;j<168;j++)
    coeffs[j] = 0;
  for(k=0;k<5;k++)                 // for each of 5 days
    for(j=k*24;j<k*24+8;j++)       // for each of 8 hours
      if (j!=(k*24+4))             // skip the break hour
        coeffs[(h+168-j)%168] = 1;
  simplex.add(IlcScalProd(coeffs,Pct) >= b[h]);
}

Page 68: Integration of Artificial Intelligence and Operations Research for

68

Solving the Integer Linear Problem

  IlcFloatVar Z = IlcSum(Pct);     // Objective
  simplex.setObjMin(Z);
  for(i=0;i<168;i++) {             // Progressive roundoff
    n = ceil(simplex.getCurrentValue(Pct[i]));
    // Fix variable and re-optimize
    simplex.add(Pct[i] == n);
  }
  m.out() << "Number of Pcts needed is " << Z << endl;
  m.end();
}
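The index arithmetic in the constraint loop is easier to check in isolation. The Python sketch below rebuilds only the coverage coefficients (which start hours put a worker on duty at hour h); it does not solve the LP or perform the progressive roundoff, and the function names are invented for this illustration.

```python
HOURS_PER_WEEK = 168

def shift_offsets():
    """Hours, counted from the start of the work week, at which a worker is on duty."""
    return [day * 24 + hour
            for day in range(5)      # 5 working days, then 2 days off
            for hour in range(8)     # 8-hour shift
            if hour != 4]            # one-hour break after 4 hours

def coverage_row(h):
    """coeffs[s] = 1 if a worker whose week starts at hour s is on duty at hour h."""
    row = [0] * HOURS_PER_WEEK
    for offset in shift_offsets():
        row[(h - offset) % HOURS_PER_WEEK] = 1   # same as (h+168-j)%168 in the C++ code
    return row

def on_duty(starts):
    """Number of workers on duty at each hour, given starts[s] workers per start hour."""
    return [sum(c * s for c, s in zip(coverage_row(h), starts))
            for h in range(HOURS_PER_WEEK)]

# One worker whose week starts at hour 0.
starts = [0] * HOURS_PER_WEEK
starts[0] = 1
duty = on_duty(starts)
print(sum(duty))   # 5 days x 7 working hours
```

A staffing vector `starts` is feasible for the model exactly when `on_duty(starts)[h] >= b[h]` for every hour h.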

Page 69: Integration of Artificial Intelligence and Operations Research for

69

Solution

• This code finds a solution with 208 workers in a couple of seconds. The optimum is 207.

• The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution

• [Bartholdi, Ratliff, Orlin]

Page 70: Integration of Artificial Intelligence and Operations Research for

70

2. Constraint Programming

Page 71: Integration of Artificial Intelligence and Operations Research for

LP/MIP is Beautiful, except when

• Variable domain information is important to the search strategy
  • especially critical in scheduling
• The problem variables range over symbolic entities and there are lots of symmetries
  • timetabling
• The MIP representation can be too verbose or awkward
  • configuration
• There are just too many constraints
  • e.g. vehicle routing

Page 72: Integration of Artificial Intelligence and Operations Research for

72

Mathematical Basis of Constraint Programming (CP)

The Constraint Satisfaction Problem:
• Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain.

• A constraint on k variables $X_1, \dots, X_k$ is a relation $R(X_1, \dots, X_k) \subseteq D_1 \times \dots \times D_k$.

• A constraint satisfaction problem (CSP) is given by a finite set of constraints.

• A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied.

Page 73: Integration of Artificial Intelligence and Operations Research for

73

Domain Reduction

• In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints.

• With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint.

• Accelerates convergence toward a solution

• Detects infeasibility

Page 74: Integration of Artificial Intelligence and Operations Research for

74

Constraint Propagation

• The other key issue is communication among the constraints or subproblems.

• The basic method used is called constraint propagation which links the constraints through their shared variables.

• The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints.
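Domain reduction and constraint propagation, as described on the last two slides, can be sketched for binary constraints in a few lines of Python. This is a textbook arc-consistency loop (in the style of AC-3), offered as an assumption about what a minimal propagation engine looks like; it is not the algorithm of any particular CP system, and the names `revise` and `propagate` are chosen for this example.

```python
from collections import deque

def revise(domains, x, y, ok):
    """Domain reduction for one constraint: drop values of x with no support in y."""
    removed = {vx for vx in domains[x]
               if not any(ok(vx, vy) for vy in domains[y])}
    domains[x] -= removed
    return bool(removed)

def propagate(domains, constraints):
    """constraints: list of (x, y, ok) where ok(value_of_x, value_of_y) is a predicate.
    Returns False if some domain becomes empty (infeasibility detected)."""
    arcs = []
    for x, y, ok in constraints:
        arcs.append((x, y, ok))
        arcs.append((y, x, lambda vy, vx, ok=ok: ok(vx, vy)))   # reversed direction
    queue = deque(arcs)
    while queue:
        x, y, ok = queue.popleft()
        if revise(domains, x, y, ok):
            if not domains[x]:
                return False
            # The constraints communicate through the shared variable x:
            # every arc that looks at x's domain must be re-examined.
            queue.extend(arc for arc in arcs if arc[1] == x)
    return True

def less_than(a, b):
    return a < b

domains = {"X": {1, 2, 3}, "Y": {1, 2, 3}, "Z": {1, 2, 3}}
feasible = propagate(domains, [("X", "Y", less_than), ("Y", "Z", less_than)])
print(feasible, domains)
```

Here propagation alone fixes X = 1, Y = 2, Z = 3. In general it only shrinks the domains, and search is still needed to finish the job.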

Page 75: Integration of Artificial Intelligence and Operations Research for

Monsieur Jourdain Phenomenon

• Like prose, you have been doing constraint propagation all your life.

– Crossword puzzles

• Incomplete and so backtracking is needed
– NY Times Sunday Crossword
– Optical Illusions

• Origin: Vision analysis (Marr, Waltz et al)

Page 76: Integration of Artificial Intelligence and Operations Research for

76

Strengths of Constraint Programming

Constraint Programming provides:

• Rich representation language.
• CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT.

• Opportunity to choose a good heuristic for the solution strategy.

Page 77: Integration of Artificial Intelligence and Operations Research for

77

Which Method for Which App?

Application              Technology
Product Mix              LP
Production Planning      MIP
Distribution Planning    MIP
Scheduling               Constraint-Based Scheduling
Dispatching              CP, Local search
Configuration            CP

Left to right across the applications: Linear => Disjunctive Constraints, and Strategic => Operational Optimization

Page 78: Integration of Artificial Intelligence and Operations Research for

78

3. Cooperating Solvers

Page 79: Integration of Artificial Intelligence and Operations Research for

First Stop

CP/CP

Page 80: Integration of Artificial Intelligence and Operations Research for

80

Mother of All Examples - N Queens

• Do we think in terms of queens?
• Where do we place this queen?

• Do we think in terms of squares?
• Will this square contain a queen?

• These views are dual to one another

Page 81: Integration of Artificial Intelligence and Operations Research for

The Primal View

For each queen assign it a square

Place this queen in this square ?

Page 82: Integration of Artificial Intelligence and Operations Research for

The Dual View

For each square decide whether it will have a queen

Place a queen in this square ?

Page 83: Integration of Artificial Intelligence and Operations Research for

83

The Primal Model

In which row do we place q[j] - the queen in column j

The constraints
q[i] != q[j]
q[i] - q[j] != i - j
q[i] - q[j] != j - i

Note: no alldifferent constraint

Page 84: Integration of Artificial Intelligence and Operations Research for

84

Yet Another duality - rows vs columns

In which column do we place qq[i] - the queen in row i

The constraints are the same
qq[i] != qq[j]
qq[i] - qq[j] != i - j
qq[i] - qq[j] != j - i

Page 85: Integration of Artificial Intelligence and Operations Research for

85

The Relationship

Can link them as inverse functions:

q[qq[i]] = i
qq[q[j]] = j

The constraint propagation

i leaves domain of q[j] iff j leaves domain of qq[i]
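The link between the two models can be checked by brute force on a small board. The Python sketch below enumerates all assignments for n = 6 using only the three pairwise constraints of the primal model (no alldifferent), builds the inverse function qq for each solution, and confirms that qq satisfies the same constraints and the two channeling equations. It demonstrates the relationship only; it is not a cooperating-solver implementation, and the helper names are made up.

```python
from itertools import product

def satisfies(q):
    """The three pairwise constraints of the primal (and of the row/column dual) model."""
    n = len(q)
    return all(q[i] != q[j] and q[i] - q[j] != i - j and q[i] - q[j] != j - i
               for i in range(n) for j in range(i + 1, n))

def inverse(q):
    """qq[i] = column of the queen in row i, given q[j] = row of the queen in column j."""
    qq = [0] * len(q)
    for j, row in enumerate(q):
        qq[row] = j
    return qq

n = 6
solutions = [q for q in product(range(n), repeat=n) if satisfies(q)]
for q in solutions:
    qq = inverse(q)
    assert satisfies(qq)                          # the dual model holds as well
    assert all(q[qq[i]] == i for i in range(n))   # q[qq[i]] = i
    assert all(qq[q[j]] == j for j in range(n))   # qq[q[j]] = j
print(len(solutions))
```

In a real CP model both arrays would be decision variables, and the channeling equations would propagate removals between their domains during search.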

Page 86: Integration of Artificial Intelligence and Operations Research for

86

In this primal/dual model

• Apply first-fail to q[i]

• Lo and behold one-third fewer fails

(Example from Jean Jordan’s thesis)

Page 87: Integration of Artificial Intelligence and Operations Research for

87

Page 88: Integration of Artificial Intelligence and Operations Research for

88

[Figure: chessboard with two queens placed (Q); squares eliminated by propagation are marked X]

Page 89: Integration of Artificial Intelligence and Operations Research for

89

[Figure: chessboard with three queens placed (Q); squares eliminated by propagation are marked X]

Page 90: Integration of Artificial Intelligence and Operations Research for

90

[Figure: chessboard with four queens placed (Q); squares eliminated by propagation are marked X]

Page 91: Integration of Artificial Intelligence and Operations Research for

91

So …

The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint

The arc consistency of this global constraint is non-trivial to maintain

Network flow algorithms
• flow goes from values to variables
• each variable has unit demand and capacity

Page 92: Integration of Artificial Intelligence and Operations Research for

92

Remarks

• An IP model will encode the first dual solution

Will this square contain a queen?
x[i][j] = 0 or x[i][j] = 1

• A disaster beyond 30 queens
• network structure on rows and columns lost

• Another example - sports scheduling

Page 93: Integration of Artificial Intelligence and Operations Research for

93

Constraints and Indices

• In IP symbols are represented by indices as opposed to values for variables.

• nurses, teams
• Paris-St-Germain plays Manchester United on day k
$x_{ijk} \in \{0, 1\}$ to represent “team i plays team j on day k”

• You can’t put symmetry breaking and other constraints on indices.

Page 94: Integration of Artificial Intelligence and Operations Research for

Second Stop

CP/IP

Page 95: Integration of Artificial Intelligence and Operations Research for

95

CP Is Powerful, But ….

• Sometimes, inconsistencies can be overlooked
X - Y 12X + Y 10
X in [1..20]
Y in [1..20]

• Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution

• but an LP solver would spot this

Page 96: Integration of Artificial Intelligence and Operations Research for

96

2 Dimensional Bin Packing

• Application for the Automobile Industry built by Greg Glockner

Page 97: Integration of Artificial Intelligence and Operations Research for

97

2 Dimensional Bin Packing

• The problem here is to put as many small rectangles as possible in a big rectangle, with 90 degree rotation allowed.
  • The actual application involves circuit boards

• There are two complete models, one a CP model and the other an IP model.
  • The CP model directs the search
  • The LP relaxation prunes the search space by detecting infeasible nodes

Page 98: Integration of Artificial Intelligence and Operations Research for

98

2-D Bin Packing

Arrange circuit boards onto raw material
• Boards may be rotated
• Use same number of each board

Objective: minimize scrap

Classic combinatorial optimization problem

Page 99: Integration of Artificial Intelligence and Operations Research for

99

Solving 2-D Bin Packing

Use CP to generate partial solutions (nodes)
• Restrict placement to reduce fragmentation of blank space

Use tight LP to test feasibility
• If any partial solution is infeasible in the LP, prune the tree immediately

CP constraints reduce the tree width
LP allows us to prune quickly

Page 100: Integration of Artificial Intelligence and Operations Research for

100

2 Dimensional Bin Packing

• As the search tree is traversed, the two models are in sync.

• Note that the variables used in the 2 models are disjoint

• The two models are dual to each other
• The IP sees the model from the point of view of the board, the large rectangle
• The CP sees the model from the point of view of the small rectangles
• Solutions are obtained in minutes

Page 101: Integration of Artificial Intelligence and Operations Research for

101

2DBP: Basic CP Formulation

• Let $(x_i, y_i)$ be the location and $(w_i, h_i)$ be the dimensions of the ith tile

• Basic constraints:
– Disjunctive constraints to prevent overlapping tiles: for each pair of tiles i, j, at least one of
$x_i + w_i \le x_j$, $y_i + h_i \le y_j$, $x_j + w_j \le x_i$, $y_j + h_j \le y_i$
– Constraints to count the number of each tile type

Tile-oriented formulation
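As a small illustration of the disjunctive constraint in the tile-oriented view, the Python sketch below tests whether two placed tiles overlap. The tuple layout `(x, y, w, h)` and the function names are assumptions of this example; a CP solver would post the disjunction as a constraint over decision variables rather than test fixed values.

```python
def no_overlap(tile_i, tile_j):
    """True if at least one of the four disjuncts holds for tiles given as (x, y, w, h)."""
    xi, yi, wi, hi = tile_i
    xj, yj, wj, hj = tile_j
    return (xi + wi <= xj or xj + wj <= xi or
            yi + hi <= yj or yj + hj <= yi)

def rotated(tile):
    """The same tile after a 90 degree rotation: width and height are swapped."""
    x, y, w, h = tile
    return (x, y, h, w)

a = (0, 0, 3, 1)          # 3 wide, 1 high, at the origin
b = (0, 1, 2, 2)          # sits directly above a
print(no_overlap(a, b))           # touching edges do not overlap
print(no_overlap(rotated(a), b))  # rotating a makes it 1 wide, 3 high
```

Rotation changes which disjunct can hold, which is one reason the search has to decide orientation together with position.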

Page 102: Integration of Artificial Intelligence and Operations Research for

102

2DBP: Basic IP Formulation

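Let $x_{ijnt} = 1$ if tile n of type t is in position (i, j). The constraints are:

$\sum_{(i',j',n,t)\,:\; i' \le i < i' + w_t,\; j' \le j < j' + h_t} x_{i'j'nt} \le 1 \quad \forall i, j$

$\sum_{i,j} x_{ijnt} \le 1 \quad \forall n, t$

$x_{ijnt} \in \{0, 1\} \quad \forall i, j, n, t$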

Grid-oriented formulation

Page 103: Integration of Artificial Intelligence and Operations Research for

103

2DBP: LP Issues

• The LP is large

• The LP exhibits significant primal degeneracy

• The LP exhibits significant dual degeneracy

Page 104: Integration of Artificial Intelligence and Operations Research for

104

2DBP: LP Issues

• The simplex algorithm cannot solve the LP

• There is no way for a MIP solver to solve the IP as such

• The barrier method can solve the LP

Page 105: Integration of Artificial Intelligence and Operations Research for

105

2DBP: Summary

• CP as master problem

– Orders tiles

– Places tiles by position, then type

– Selects tile type by frequency to scatter tiles throughout the bin

– Uses a one-ply lookahead constraint to limit the position of following tile

• LP relaxation prunes the CP search space

– Checks whether the partial solution will lead to an infeasible instance

• Use idiomatic formulations for CP and IP

Page 106: Integration of Artificial Intelligence and Operations Research for

106

2DBP: Remarks

• The CP fixes significant numbers of variables at each node

• The LP pre-processor greatly simplifies the LP

• Therefore, the lack of incrementality of the barrier method does not cost us

Page 107: Integration of Artificial Intelligence and Operations Research for

107

2DBP: Cooperative Algorithm Demo

Page 108: Integration of Artificial Intelligence and Operations Research for

108

Last Stop

• Constraint Programming and Local Search cooperation

• Another example of duality in action

Page 109: Integration of Artificial Intelligence and Operations Research for

109

CP/LS

Parallel machines with set-up times

Ready times

Due dates

Splittable jobs

Rogue machines

Objectives

– meet due dates

– minimize setup costs

Page 110: Integration of Artificial Intelligence and Operations Research for

110

Two Phase Cooperation

Phase I - the Primal (Work on first objective)

• Configure and schedule the jobs

– Use constraint based scheduling

Page 111: Integration of Artificial Intelligence and Operations Research for

111

Two Phase Cooperation

Machines morph into trucks

Page 112: Integration of Artificial Intelligence and Operations Research for

112

Two Phase Cooperation

• Phase 2 - the Dual (Work on second objective)

• Schedule the trucks

– Use Lin, Lin-Kernighan, tabu etc

Page 113: Integration of Artificial Intelligence and Operations Research for

113

Parallel Machines: Cooperative Algorithm Demo

Page 114: Integration of Artificial Intelligence and Operations Research for

114

IC Park Example

• Hoist Scheduling (Rodosek and Wallace)

• The original model is an IP

• The CP model is “the same”

• CP guides search, LP relaxation and CP share pruning duties

• No apparent duality

Page 115: Integration of Artificial Intelligence and Operations Research for

115

Remarks

• One can get great benefit with CP/IP algorithms, CP/CP algorithms and CP/LS algorithms

• IP/LS is just around the corner

• IP/IP cooperation is hard because one can’t formulate truly dual views

– either simply not there

– or too verbose

counterexamples welcome

Page 116: Integration of Artificial Intelligence and Operations Research for

116

4. DISJUNCTIVE PROGRAMMING

Page 117: Integration of Artificial Intelligence and Operations Research for

117

Disjunctive Linear Programming

• An extension of Mixed Integer Programming

• A union of polyhedral sets (feasible regions) is called a disjunctive set.

Page 118: Integration of Artificial Intelligence and Operations Research for

118

Disjunctive Set

Page 119: Integration of Artificial Intelligence and Operations Research for

119

Disjunctive Linear Programming

• The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem.

• The solution set of the disjunctive programming problem is

$\bigcap_{i < M} \; \bigcup_{j < N} F_{ij}$

Page 120: Integration of Artificial Intelligence and Operations Research for

120

Disjunctive Linear Programming Examples

• Semi-continuous variables: either X >= 100 or X == 0

• Rather than X <= BigM*Y, X >= 100*Y, with Y a 0-1 variable

Page 121: Integration of Artificial Intelligence and Operations Research for

121

Solution Set Inside Initial Region

Page 122: Integration of Artificial Intelligence and Operations Research for

122

Disjunctive Linear Programming Examples - continued

• Bollapragada, Ghattas and Hooker: truss structure design problem

– Branches directly on alternatives dictated by Hooke’s Law

• Wyatt: disjunctive programming and mean absolute deviation (MAD) models for portfolio optimization

– Extends Benders decomposition to disjunctive linear programs

Page 123: Integration of Artificial Intelligence and Operations Research for

123

Disjunctive Linear Programming continued

• Balas, Cornuejols and Ceria: generating cuts for disjunctive programming problems.

• McAloon and Tretkoff: basic mathematical results (Optimization and Computational Logic, Wiley)

Page 124: Integration of Artificial Intelligence and Operations Research for

124

Disjunctive Linear Programming continued

• Dealing with the disjunctive part requires search.

• This requires an engine which is not available in MIP packages

• Also the linear relaxation is not as tight and the evaluation function is not as faithful

• The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers

• Beringer and DeBacker for MIP

Page 125: Integration of Artificial Intelligence and Operations Research for

125

To Keep It Simple

GERALD + DONALD = ROBERT

An AI classic (Newell and Simon)

Assignment problem + 1 constraint

Surprisingly hard for MIP solvers: CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution

Page 126: Integration of Artificial Intelligence and Operations Research for

126

The Disjunctive Program

• One constraint for the equation

– 100000 G + … + D = 100000 R + … + T

• For each variable X among G,…,T

– X = 0 or X = 1 or … or X = 9

• For each pair X, Y

– X ≤ Y-1 or Y ≤ X-1
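As a point of reference for the puzzle itself, here is a plain column-by-column backtracking search in Python. It is only a sketch and is not the cooperating-solver method of the slides: the function `solve_cryptarithm` and its helpers are hypothetical names, and the assumption that leading letters are non-zero follows the usual puzzle convention.

```python
def solve_cryptarithm(w1, w2, total):
    """Find digits with w1 + w2 == total, all letters different, no leading zero."""
    n = len(total)
    assert len(w1) == len(w2) == n, "sketch assumes three words of equal length"
    leading = {w1[0], w2[0], total[0]}
    assign, used = {}, set()

    def options(letter):
        # An already-bound letter has exactly one option: its current digit.
        if letter in assign:
            return [assign[letter]]
        return [d for d in range(10)
                if d not in used and not (d == 0 and letter in leading)]

    def bind(letter, digit):
        # Returns True only when a new binding was created (so it can be undone).
        if letter in assign:
            return False
        assign[letter] = digit
        used.add(digit)
        return True

    def unbind(letter):
        used.discard(assign.pop(letter))

    def column(col, carry):
        if col == n:
            return carry == 0
        a, b, s = w1[n - 1 - col], w2[n - 1 - col], total[n - 1 - col]
        for da in options(a):
            new_a = bind(a, da)
            for db in options(b):
                new_b = bind(b, db)
                next_carry, digit = divmod(da + db + carry, 10)
                if digit in options(s):
                    new_s = bind(s, digit)
                    if column(col + 1, next_carry):
                        return True
                    if new_s:
                        unbind(s)
                if new_b:
                    unbind(b)
            if new_a:
                unbind(a)
        return False

    return dict(assign) if column(0, 0) else None


solution = solve_cryptarithm("GERALD", "DONALD", "ROBERT")
print(solution)
```

The search enforces the three groups of constraints listed above: the sum (column by column, with carries), one digit per letter, and pairwise difference of letters.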

Page 127: Integration of Artificial Intelligence and Operations Research for

127

Solution Set: SOME of the Integer Points in the Region

Page 128: Integration of Artificial Intelligence and Operations Research for

128

The Twin Variables for Cooperating Solvers

• Integer variables for the letters: 0 ≤ g, e, r, a, l, d, o, n, b, t ≤ 9

• With continuous doppelgangers: 0 ≤ G, E, R, A, L, D, O, N, B, T ≤ 9

Page 129: Integration of Artificial Intelligence and Operations Research for

129

The Variables

• One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver

100000 G + 10000 E + 1000 R + … + D +
100000 D + 10000 O + 1000 N + … + D =
100000 R + 10000 O + 1000 B + … + T

Page 130: Integration of Artificial Intelligence and Operations Research for

130

The Variables

• One CSP constraint on the integer variables posted to a discrete constraint propagation engine

AllDifferent(g, e, r, a, l, d, o, n, b, t)

Page 131: Integration of Artificial Intelligence and Operations Research for

131

The Search

• Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely

• The branching strategy is guided by the linear relaxation on the continuous variables

• If there is a non-integral variable X, with value X* in the relaxation, branch on it:

X ≤ floor(X*) or

X ≥ ceil(X*)

Page 132: Integration of Artificial Intelligence and Operations Research for

132

The Search

• If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack

• If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack

Page 133: Integration of Artificial Intelligence and Operations Research for

133

The Search

• New wrinkle

• The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values

• In this case, branch to keep them apart

– either X ≤ Y - 1

– or Y ≤ X - 1

Page 134: Integration of Artificial Intelligence and Operations Research for

134

The Variables

int main() {
    IlcInitFloat();
    IlcManager m(IlcNoEdit);

    // One digit variable per letter; the leading letters D, G and R start at 1
    IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9),
              G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9);

    IlcIntVarArray vars(m, 10, D, O, N, A, L, G, E, R, B, T);

    // Continued on next slide

Page 135: Integration of Artificial Intelligence and Operations Research for

135

The Constraints

    m.add(IlcAllDiff(vars, IlcWhenValue));

    IlcLinOpt simplex(m);
    simplex.add(
        100000*R + 10000*O + 1000*B + 100*E + 10*R + T ==
        100000*G + 10000*E + 1000*R + 100*A + 10*L + D +
        100000*D + 10000*O + 1000*N + 100*A + 10*L + D,
        IlcTrue  // Post to Solver as well
    );

Page 136: Integration of Artificial Intelligence and Operations Research for

136

The Search for solutions

    m.add(Generate(m, simplex, vars));  // Search strategy

    if (m.nextSolution()) {  // Find a solution
        m.out() << " solution found " << endl;
    }
    m.printInformation();
    m.end();
}

Page 137: Integration of Artificial Intelligence and Operations Research for

137

Branch if a variable is non-integer

ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) {
    IlcInt varIndex = MostNotInteger(vars, simplex);

    if (varIndex >= 0)  // There is a non-integer variable
        return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex),
                      this);

Page 138: Integration of Artificial Intelligence and Operations Research for

138

Is integer relaxation a solution ?

    IlcManager m = getManager();
    if (m.solve(TestIntegerRelaxation(m, simplex)))
        return 0;

Page 139: Integration of Artificial Intelligence and Operations Research for

139

Find two variables with same value

    IlcInt i, j;
    for (i = 0; i < vars.getSize()-1; i++) {
        if (vars[i].isBound()) continue;  // Can't both be bound
        IlcInt n = simplex.nearest(simplex.getCurrentValue(vars[i]));
        for (j = i+1; j < vars.getSize(); j++) {
            IlcInt m = simplex.nearest(simplex.getCurrentValue(vars[j]));
            if (m == n) break;
        }
        if (j < vars.getSize()) break;
    }

Page 140: Integration of Artificial Intelligence and Operations Research for

140

Branch to push them apart

// j and i are the indices of two variables with same current value

    return IlcAnd(
        IlcOr(Smaller(m, vars[i], vars[j], simplex),
              Smaller(m, vars[j], vars[i], simplex)),
        this  // Recursion
    );
}

Page 141: Integration of Artificial Intelligence and Operations Research for

141

Pushing two variables apart

ILCGOAL3(Smaller, IlcIntVar, x, IlcIntVar, y, IlcSimplex, simplex) {
    simplex.add(x <= y-1, IlcTrue);
    return 0;
}

Page 142: Integration of Artificial Intelligence and Operations Research for

142

Testing the integer relaxation

ILCGOAL1(TestIntegerRelaxation, IlcSimplex, simplex) {
    simplex.trySolution();
    return 0;
}

Page 143: Integration of Artificial Intelligence and Operations Research for

143

Results

• ILOG Solver/Planner finds a solution in 6 nodes (.29 seconds on laptop)

• Straightforward ILOG Solver finds a solution in 8024 nodes (1.8 seconds on a laptop)

• Again, CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution

Page 144: Integration of Artificial Intelligence and Operations Research for

144

Example: The Dutch Trains

Scheduling intercity trains

Amsterdam, Rotterdam, Roosendaal, Vlissingen

Without coupling constraints, multi-commodity integer flow problem

With coupling constraints, a DLP with an integer relaxation

Additional logic handled directly in 2LP with CPLEX

Disjunctive Programming and Cooperating Solvers, CSTS 98 (Kluwer, edited by D. Woodruff)

Page 145: Integration of Artificial Intelligence and Operations Research for

Conclusions

• CP and MIP are powerful techniques that can solve many combinatorial problems

• Each has preferred formulations

• Can get even greater benefits when combining CP and IP algorithms

Page 146: Integration of Artificial Intelligence and Operations Research for

146

Recent and Current Work

Beaumont

Beringer, DeBacker

Balas, Ceria, Cornuejols.

Wallace, Rodosek, Schrimpf

Heipke, Colombani

Bockmayr

McAloon, Tretkoff, Wetzel

Page 147: Integration of Artificial Intelligence and Operations Research for

147

III. Exploiting Randomization to Solve Hard Combinatorial

Problems

Page 148: Integration of Artificial Intelligence and Operations Research for

148

Background

Combinatorial search methods often exhibit a remarkable variability in performance. It is common to observe significant differences between:

- different heuristics

- same heuristic on different instances

- different runs of same heuristic with different seeds (stochastic methods)

Page 149: Integration of Artificial Intelligence and Operations Research for

149

Main Claim

One can take advantage of the extreme variability of combinatorial search methods:

One can improve the performance of a deterministic complete method by introducing a stochastic element, while maintaining completeness.

We’ll explain WHY that is the case.

Page 150: Integration of Artificial Intelligence and Operations Research for

150

A Structured Benchmark Domain for Studying the Distributions of Search Methods

Stochasticity in Search Procedures

Intriguing Properties of Complete Backtrack Style Algorithms

Consequences for Algorithm Design - Rapid Randomized Restarts

Portfolio of Algorithms

Page 151: Integration of Artificial Intelligence and Operations Research for

151

Structured Benchmark Domain

Page 152: Integration of Artificial Intelligence and Operations Research for

152

Background

Study of local and systematic search methods has been driven by:

Random instance distributions (Hogg et al. 96). Limitation: lack of structure that characterizes realistic problems;

Highly structured problems (Fujita et al. 93). Limitation: “too much” structure.

We propose a benchmark domain that bridges the gap between purely random instances and highly structured problems.

Gomes and Selman 1997 - Proc. AAAI-97

Page 153: Integration of Artificial Intelligence and Operations Research for

153

Quasigroups

Defn.: a pair (Q, *) where Q is a set, and * is a binary operation on Q such that the equations a * x = b and y * a = b are uniquely solvable for every pair of elements a, b in Q.

The multiplication table of its binary operation defines a latin square (i.e., each element of Q appears exactly once in each row/column).

Example: Quasigroup of order 4
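To make the latin-square property concrete, the sketch below checks it on a multiplication table. The order-4 table used here is the cyclic group Z4 under addition mod 4, chosen for illustration because the slide's own example did not survive extraction; `is_latin_square` and `cyclic_table` are hypothetical helper names.

```python
def is_latin_square(table):
    """True when every symbol 0..n-1 appears exactly once in each row and column."""
    n = len(table)
    symbols = set(range(n))
    if any(len(row) != n or set(row) != symbols for row in table):
        return False
    return all({table[r][c] for r in range(n)} == symbols for c in range(n))


def cyclic_table(n):
    """Multiplication table of (Z_n, +): entry (a, b) holds (a + b) mod n."""
    return [[(a + b) % n for b in range(n)] for a in range(n)]


for row in cyclic_table(4):
    print(row)
print(is_latin_square(cyclic_table(4)))
```

Every group table is a latin square, but a quasigroup need not be associative, so the converse does not hold.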

Page 154: Integration of Artificial Intelligence and Operations Research for

154

Given a partial latin square, can it be completed?

Example:

Quasigroup Completion Problem (QCP)

Page 155: Integration of Artificial Intelligence and Operations Research for

155

Quasigroup Completion Problem: A Framework for Studying Search

NP-Complete (Colbourn 1983, 1984; Anderson 1985).

Has a structure not found in random instances.

Leads to interesting search problems when structure is perturbed.

The study of this problem led us to identify the unusual distributions of combinatorial search (Gomes, Selman & Crato --- CP97)

Page 156: Integration of Artificial Intelligence and Operations Research for

156

Aside: Applications of Quasigroups

Design of statistical experiments

eliminating data dependencies

Scheduling/Timetabling (Anderson 1992)

completing a schedule given a set of pre-defined events

Automated theorem proving (Fujita et al. 1993)

existence vs. non-existence of quasigroups with intricate mathematical properties

Page 157: Integration of Artificial Intelligence and Operations Research for

157

Example: Scheduling of Drug Experiment

Given 5 different drugs, test the effects of the different medications on 5 different subjects over different days of the week.

Use constraint:No two people get same brand on the same day (eliminate bias for day of the week).

Page 158: Integration of Artificial Intelligence and Operations Research for

158

Quasigroup Completion

SUBJECT   DAY
          Mon.       Tues.         Wed.        Thurs.         Fri.
Tim       Tylenol    Aleve         Bayer       Exhedrin (*)   Advil
Sue       Aleve      Bayer         Exhedrin    Advil          Tylenol
Frank     Bayer      Exhedrin      Advil (*)   Tylenol        Aleve
Teresa    Exhedrin   Advil (*)     Tylenol     Aleve          Bayer
Todd      Advil      Tylenol       Aleve       Bayer          Exhedrin

(*) Pre-assigned

Page 159: Integration of Artificial Intelligence and Operations Research for

159

QCP has a natural formulation as a Constraint Satisfaction Problem

one variable for each entry of the NxN table

constraints capture row/column requirement

variable assignments capture pre-assigned values
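The CSP view can be sketched as a small backtracking search: one variable per cell, row and column all-different requirements, and pre-assigned cells fixed in advance. This is an illustrative sketch only; `complete_latin_square` is a hypothetical name, the smallest-domain-first ordering is an assumption, and the code assumes the pre-assigned cells do not already clash with each other.

```python
def complete_latin_square(grid):
    """grid: n x n list of lists with None for empty cells.
    Returns a completed copy, or None when no completion exists."""
    n = len(grid)
    cells = [row[:] for row in grid]

    def candidates(r, c):
        # Values not yet used in row r or column c.
        taken = {cells[r][k] for k in range(n)} | {cells[k][c] for k in range(n)}
        return [v for v in range(n) if v not in taken]

    def search():
        empties = [(r, c) for r in range(n) for c in range(n) if cells[r][c] is None]
        if not empties:
            return True
        # Pick the empty cell with the fewest remaining candidates.
        r, c = min(empties, key=lambda rc: len(candidates(*rc)))
        for v in candidates(r, c):
            cells[r][c] = v
            if search():
                return True
            cells[r][c] = None
        return False

    return cells if search() else None


partial = [[0, None, None, None],
           [None, 1, None, None],
           [None, None, 2, None],
           [None, None, None, 3]]
done = complete_latin_square(partial)
for row in done:
    print(row)
```

Stronger propagation (forward checking, arc consistency, or general arc consistency on the all-different constraints) would replace the simple `candidates` filter in a real solver.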

Page 160: Integration of Artificial Intelligence and Operations Research for

160

How does the difficulty of QCP vary with the fraction of pre-assignment?

Page 161: Integration of Artificial Intelligence and Operations Research for

161

[Figure: median number of backtracks (log) against fraction of pre-assignment. Regions: under-constrained area, critically constrained area, over-constrained area.]

Page 162: Integration of Artificial Intelligence and Operations Research for

162

Complexity Graph shows (up to order 20):

curve peaks around 42% of pre-assignment --- critically constrained area.

under-constrained and over-constrained areas are easier.

Page 163: Integration of Artificial Intelligence and Operations Research for

163

Directly related to the peak in computational difficulty is the so-called phase transition graph for the QCP problem.

Page 164: Integration of Artificial Intelligence and Operations Research for

164

[Figure: fraction of unsolved cases against fraction of pre-assignment. Regions: almost all solvable area, phase transition area, almost all unsolvable area.]

Page 165: Integration of Artificial Intelligence and Operations Research for

165

Phase Transition

QCP Phase Transition --- threshold phenomenon from almost all solvable to almost all unsolvable --- occurs around 42% of preassignment.

It’s called a phase transition because of the close relation to state transition phenomena studied in physics, such as the melting of a solid into a liquid.

Page 166: Integration of Artificial Intelligence and Operations Research for

166

Exploiting Structure

Page 167: Integration of Artificial Intelligence and Operations Research for

167

Forward Checking Arc Consistency on binary constraints

Exploiting Structure in QCP

Page 168: Integration of Artificial Intelligence and Operations Research for

168

Arc Consistency on Binary Constraints

Further Exploiting Structure in QCP

Shaw, Stergiou and Walsh - ECAI98

General Arc Consistency on all different

constraints

Page 169: Integration of Artificial Intelligence and Operations Research for

169

Enforcing General Arc Consistency on All Different Constraints

• Beautiful example of integration of AI/OR techniques for a well defined sub-problem

• Propagation uses Maximum Matching problem (particular case of Network Flow problems which have polynomial time complexity)

Regin - AAAI94

Page 170: Integration of Artificial Intelligence and Operations Research for

170

Further Exploiting Structure in QCP

By enforcing general arc consistency on all-different constraints, problems up to order 50 could be solved!

Shaw, Stergiou and Walsh - ECAI98 Regin - AAAI94

Page 171: Integration of Artificial Intelligence and Operations Research for

171

Stochasticity in Search Procedures

Page 172: Integration of Artificial Intelligence and Operations Research for

172

Background

Stochastic strategies have been very successful in the area of local search.

Limitation: inherent incomplete nature of local search methods.

We want to explore the addition of a stochastic element to a systematic search procedure without losing completeness.

Page 173: Integration of Artificial Intelligence and Operations Research for

173

We introduce stochasticity in a backtrack search method by randomly breaking ties in variable and/or value selection.

Compare with standard lexicographic tie-breaking.

Page 174: Integration of Artificial Intelligence and Operations Research for

174

Randomized Strategies

Strategy Variable sel. Value sel.

DD deterministic deterministic

DR deterministic random

RD random deterministic

RR random random
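The four strategies differ only in whether ties are broken lexicographically or at random. A minimal sketch follows, assuming a smallest-domain-first variable heuristic (the slides do not name the underlying heuristic); `select_variable` and `order_values` are hypothetical helpers, and passing `rng=None` gives the deterministic variant.

```python
import random


def select_variable(domains, rng=None):
    """domains: dict mapping each unassigned variable to its remaining values.
    Ties among the smallest domains are broken lexicographically, or at
    random when a random generator is supplied."""
    smallest = min(len(d) for d in domains.values())
    ties = sorted(v for v, d in domains.items() if len(d) == smallest)
    return ties[0] if rng is None else rng.choice(ties)


def order_values(values, rng=None):
    """Ascending value order, or a random order when rng is supplied."""
    ordered = sorted(values)
    if rng is not None:
        rng.shuffle(ordered)
    return ordered


domains = {"a": {1, 2}, "b": {1, 2}, "c": {1, 2, 3}}
rng = random.Random(0)
print(select_variable(domains), order_values(domains["a"]))            # DD
print(select_variable(domains, rng), order_values(domains["a"], rng))  # RR
```

Because the randomization only reorders choices inside a complete backtrack search, every branch is still explored eventually, so completeness is preserved.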

Page 175: Integration of Artificial Intelligence and Operations Research for

175

Page 176: Integration of Artificial Intelligence and Operations Research for

176

Page 177: Integration of Artificial Intelligence and Operations Research for

177

Page 178: Integration of Artificial Intelligence and Operations Research for

178

Lesson: Randomized tie-breaking can improve performance over a purely deterministic strategy.

Next: But we can obtain a more dramatic advantage from randomization ...

Page 179: Integration of Artificial Intelligence and Operations Research for

179

Cost Distributions

Key Properties:

I Erratic behavior of mean.

II Distributions have “heavy tails”.

Page 180: Integration of Artificial Intelligence and Operations Research for

180

[Figure: sample mean against number of runs. Labels: Median = 1!; 3500!; 500; 2000.]

Page 181: Integration of Artificial Intelligence and Operations Research for

181

1

Page 182: Integration of Artificial Intelligence and Operations Research for

182

[Figure: proportion of cases solved against number of backtracks (two panels). Annotations: 75% <= 30; 5% > 100000.]

Page 183: Integration of Artificial Intelligence and Operations Research for

183

Heavy-Tailed Distributions

… infinite variance … infinite mean

Introduced by Pareto in the 1920’s --- “probabilistic curiosity.”

Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena.

Examples: stock-market, earth-quakes, weather,...

Page 184: Integration of Artificial Intelligence and Operations Research for

184

Decay of Distributions

Standard --- Exponential Decay, e.g. Normal:

$\Pr[X > x] \sim C e^{-x^2}$, for some $C > 0$, $x > 1$

Heavy-Tailed --- Power Law Decay, e.g. Pareto-Levy:

$\Pr[X > x] \sim C x^{-\alpha}$, $x > 0$

Page 185: Integration of Artificial Intelligence and Operations Research for

185

[Figure: Standard Distribution (finite mean & variance) with exponential decay, compared with power law decay.]

Page 186: Integration of Artificial Intelligence and Operations Research for

186

Normal, Cauchy, and Levy

Normal - Exponential Decay

Cauchy - Power law Decay

Levy - Power law Decay

Page 187: Integration of Artificial Intelligence and Operations Research for

187

Tail Probabilities (Standard Normal, Cauchy, Levy)

c    Normal        Cauchy    Levy
0    0.5           0.5       1
1    0.1587        0.25      0.6827
2    0.0228        0.1476    0.5205
3    0.001350      0.1024    0.4363
4    0.00003167    0.078     0.3829
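The tail probabilities Pr[X > c] can be reproduced from closed forms using only the Python standard library. The sketch assumes the standard parameterizations: Normal(0, 1), Cauchy with location 0 and scale 1, and Levy with location 0 and scale 1; the function names are illustrative.

```python
import math


def normal_tail(c):
    """Pr[X > c] for a standard normal variable."""
    return 0.5 * math.erfc(c / math.sqrt(2))


def cauchy_tail(c):
    """Pr[X > c] for a standard Cauchy variable."""
    return 0.5 - math.atan(c) / math.pi


def levy_tail(c):
    """Pr[X > c] for a standard Levy variable (supported on x > 0)."""
    return 1.0 if c <= 0 else math.erf(1 / math.sqrt(2 * c))


for c in range(5):
    print(c, f"{normal_tail(c):.8f}", f"{cauchy_tail(c):.4f}", f"{levy_tail(c):.4f}")
```

At c = 4 the normal tail is already of order 10^-5 while the Cauchy and Levy tails remain several percent or more, which is the contrast between exponential and power-law decay.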

Page 188: Integration of Artificial Intelligence and Operations Research for

188

How to Check for “Heavy Tails”?

Log-Log plot of tail of distribution should be approximately linear.

Slope gives value of $\alpha$:

$\alpha \le 1$: infinite mean and infinite variance

$1 < \alpha < 2$: infinite variance
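One simple way to read the tail index off a log-log plot is a least-squares line through the points (log x, log(1 - F(x))). The sketch below does this on exact power-law data; `tail_index` and `classify` are hypothetical helpers, and a plain regression is an assumption made here for brevity rather than the estimator used in the cited papers.

```python
import math


def tail_index(xs, survival):
    """Least-squares slope of log(survival) against log(x); returns alpha = -slope."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(s) for s in survival]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    sxx = sum((a - mx) ** 2 for a in lx)
    return -sxy / sxx


def classify(alpha):
    """Which moments are infinite for a power-law tail with index alpha."""
    if alpha <= 1:
        return "infinite mean and infinite variance"
    if alpha < 2:
        return "finite mean, infinite variance"
    return "finite mean and variance"


xs = [10 ** k for k in range(1, 6)]
survival = [x ** -0.5 for x in xs]   # exact tail Pr[X > x] = x^(-1/2)
alpha = tail_index(xs, survival)
print(alpha, classify(alpha))
```

On real run-time data the fit should be restricted to the tail, since only the large-x part of the distribution is expected to follow a power law.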

Page 189: Integration of Artificial Intelligence and Operations Research for

189

Example of Heavy Tailed Model(Random Walk)

Random Walk:

• Start at position 0

• Toss a fair coin:

– with each head take a step up (+1)

– with each tail take a step down (-1)

X --- number of steps the random walk takes to return to position 0.
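The random walk is easy to simulate. The sketch below draws return times X with a fixed seed; the cap of 100,000 steps per walk is an assumption added so the simulation always terminates, and capped walks are reported as `None`.

```python
import random


def return_time(rng, max_steps=100_000):
    """Number of steps until the walk first returns to 0, or None if capped."""
    position, steps = 0, 0
    while steps < max_steps:
        position += 1 if rng.random() < 0.5 else -1
        steps += 1
        if position == 0:
            return steps
    return None


rng = random.Random(0)
times = [return_time(rng) for _ in range(1000)]
finished = sorted(t for t in times if t is not None)

print("share returning after 2 steps:", sum(t == 2 for t in times) / len(times))
print("longest finished walk:", finished[-1])
print("walks hitting the cap:", times.count(None))
```

About half of the walks return after only two steps, yet a few take orders of magnitude longer: the same mix of very short and very long runs seen in the backtrack search data.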

Page 190: Integration of Artificial Intelligence and Operations Research for

190

The record of 10,000 tosses of an ideal coin (Feller)

Zero crossing Long periods without zero crossing

Page 191: Integration of Artificial Intelligence and Operations Research for

191

Random Walk

Heavy-tails vs. Non-Heavy-Tails

[Figure: unsolved fraction 1-F(x) against X, the number of steps the walk takes to return to zero (log scale). Curves: Normal(2,1000000), Normal(2,1), random walk. Annotations: 0.1% > 200000; 50%; Median = 2.]

Page 192: Integration of Artificial Intelligence and Operations Research for

192

Heavy-tails in QCP Domain

[Figure: log-log plot of unsolved fraction 1-F(x) against number of backtracks (log). Curve labels: 0.466, 0.319, 0.153. Annotation: $\alpha < 1$ => Infinite mean.]

Page 193: Integration of Artificial Intelligence and Operations Research for

193

The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior.

Page 194: Integration of Artificial Intelligence and Operations Research for

194

Page 195: Integration of Artificial Intelligence and Operations Research for

195

Page 196: Integration of Artificial Intelligence and Operations Research for

196

Heavy Tailed Cost Distribution

[Figure: log(1 - F(x)), from 0.1 to 1, against log(Backtracks), from 1 to 100000.]

Page 197: Integration of Artificial Intelligence and Operations Research for

197

The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior.

Page 198: Integration of Artificial Intelligence and Operations Research for

198

By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution.

Right-hand side: non-negligible fraction of very long runs

Left-hand side: non-negligible fraction of very short runs

Page 199: Integration of Artificial Intelligence and Operations Research for

199

Sports Scheduling

[Figure: cumulative distribution function against number of backtracks (log). Annotations: 70% > 250000; 15!; 1% <= 650!.]

Page 200: Integration of Artificial Intelligence and Operations Research for

200

[Figure: Standard Distribution (finite mean & variance) with exponential decay, compared with power law decay.]

Also, heavy tails on left. (High probability of very short runs.)

Page 201: Integration of Artificial Intelligence and Operations Research for

201

Consequence for algorithm design:

Use rapid restarts or parallel / inter-leaved runs

Super linear speedups!!!

Page 202: Integration of Artificial Intelligence and Operations Research for

202

Super-linear Speedups

[Figure: search tree with five failed branches (X), each labeled 10, and one branch labeled "solved".]

Sequential: 50 + 1 = 51 seconds

Parallel: 10 machines --- 1 second, 51 x speedup

Interleaved (1 machine): 10 x 1 = 10 seconds, 5 x speedup
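The arithmetic on this slide can be spelled out directly. The sketch assumes the reading suggested by the numbers: five failed branches costing 10 seconds each, a solving branch costing 1 second, and 10 runs executed either on 10 machines or interleaved on one.

```python
failed_branch_seconds = [10, 10, 10, 10, 10]  # five branches that fail
solving_branch_seconds = 1                    # the branch that succeeds
runs = 10                                     # parallel machines, or interleaved runs

sequential = sum(failed_branch_seconds) + solving_branch_seconds
parallel = solving_branch_seconds             # the lucky run finishes first
interleaved = runs * solving_branch_seconds   # each run advanced 1 second in turn

print("sequential:", sequential, "seconds")
print("parallel speedup:", sequential / parallel)
print("interleaved speedup:", sequential / interleaved)
```

The parallel speedup of 51 on only 10 machines is what makes it super-linear; the interleaved figure of 5.1 appears rounded to 5 on the slide.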

Page 203: Integration of Artificial Intelligence and Operations Research for

203

Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution.

RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left.

Restarts provably eliminate heavy tails (Gomes, Selman & Crato )

Page 204: Integration of Artificial Intelligence and Operations Research for

204

Sketch of proof of elimination of heavy tails

Let’s truncate the search procedure after m backtracks.

Probability of solving problem with truncated version:

$p_m = \Pr[X \le m]$, where $X$ is the number of backtracks to solve the problem

Run the truncated procedure and restart it repeatedly.

Page 205: Integration of Artificial Intelligence and Operations Research for

205

$Y$ = total number of backtracks with restarts

Number of restarts $= \lfloor Y/m \rfloor \sim \mathrm{Geometric}(p_m)$

$\bar{F}(y) = \Pr[Y > y] \le (1 - p_m)^{\lfloor y/m \rfloor} \le c_1 e^{-c_2 y}$

Y does not have heavy tails
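The effect of truncation can be checked on a synthetic heavy-tailed cost. In the sketch below, the single-run cost X is a hypothetical discrete Pareto-like variable with tail index 0.5 (so its mean is infinite), the cutoff m = 10 is arbitrary, and `tail_bound` evaluates the geometric bound (1 - p_m)^floor(y/m) on Pr[Y > y].

```python
import math
import random

ALPHA = 0.5  # hypothetical tail index: Pr[X > x] = (x + 1)^(-ALPHA) for integer x >= 0


def draw_cost(rng):
    """One run without restarts: heavy-tailed number of backtracks, at least 1."""
    u = 1.0 - rng.random()            # uniform on (0, 1]
    return int(u ** (-1.0 / ALPHA))


def success_probability(m):
    """p_m = Pr[X <= m] for the synthetic distribution above."""
    return 1.0 - (m + 1) ** (-ALPHA)


def cost_with_restarts(rng, m):
    """Total backtracks Y when every run is cut off after m backtracks."""
    total = 0
    while True:
        x = draw_cost(rng)
        if x <= m:
            return total + x
        total += m                    # a failed run costs the full cutoff


def tail_bound(y, m):
    """Upper bound (1 - p_m)^floor(y/m) on Pr[Y > y]."""
    return (1.0 - success_probability(m)) ** math.floor(y / m)


rng = random.Random(1)
m = 10
costs = [cost_with_restarts(rng, m) for _ in range(20000)]
print("largest total cost with restarts:", max(costs))
print("empirical Pr[Y > 50]:", sum(c > 50 for c in costs) / len(costs))
print("bound on Pr[Y > 50]: ", tail_bound(50, m))
```

Without the cutoff the sample maximum of X grows without limit as more runs are drawn; with restarts the tail of Y decays geometrically in y/m.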

Page 206: Integration of Artificial Intelligence and Operations Research for

206

Restarts

[Figure: unsolved fraction 1-F(x) against number of backtracks (log). Annotations: 70% unsolved; 250 ~ 62.5 restarts.]

Page 207: Integration of Artificial Intelligence and Operations Research for

207

Example of Rapid Restart Speedup (planning)

[Figure: number of backtracks (log), from 1000 to 1000000, against cutoff (log), from 1 to 1000000. Annotations: 20; 2000 ~100 restarts; ~10 restarts; 100000.]

Page 208: Integration of Artificial Intelligence and Operations Research for

208

Summary Results

Problem                Deterministic    R3
Logistics Planning     108 mins.        95 sec.
Scheduling 14          411 sec          250 sec
Scheduling 16          ---(*)           1.4 hours
Scheduling 18          ---(*)           ~18 hrs
Circuit Synthesis 1    ---(*)           165 sec.
Circuit Synthesis 2    ---(*)           17 min.

(*) not found after 2 days

Page 209: Integration of Artificial Intelligence and Operations Research for

209

Our results provide the first indication of heavy-tailed distributions in a computational model.

Overall insight: Randomized tie-breaking with rapid restarts gives a powerful search strategy.

Page 210: Integration of Artificial Intelligence and Operations Research for

210

Heavy-Tailed Distributions in Other Domains

Quasigroup Completion Problem

Graph Coloring

Logistic Planning

Circuit Synthesis

Gomes, Selman, and Crato 1997; Gomes, Selman, McAloon, and Tretkoff 1998; Gomes, Kautz, and Selman 1998.

Page 211: Integration of Artificial Intelligence and Operations Research for

211


Page 212: Integration of Artificial Intelligence and Operations Research for

212

Rapid Restart Speedup

[Figure: log(backtracks), from 1000 to 1000000, against log(cutoff), from 1 to 1000000.]

Page 213: Integration of Artificial Intelligence and Operations Research for

213


Page 214: Integration of Artificial Intelligence and Operations Research for

214

Heavy-Tailed Distributions in Other Domains

Quasigroup Completion Problem

Graph Coloring

Logistic Planning

Circuit Synthesis

Gomes, Selman, and Crato 1997 - Proc. CP97; Gomes, Selman, McAloon, and Tretkoff 1998 - Proc. AIPS98; Gomes, Kautz, and Selman 1998 - Proc. AAAI98.

Page 215: Integration of Artificial Intelligence and Operations Research for

215

Algorithm Portfolio Design

Gomes and Selman 1997 - Proc. UAI-97; Gomes, Selman, and Crato 1997 - Proc. CP97.

Page 216: Integration of Artificial Intelligence and Operations Research for

216

Motivation

The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances.

Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths.

Page 217: Integration of Artificial Intelligence and Operations Research for

217

Branch & Bound: Best Bound vs. Depth First Search

Page 218: Integration of Artificial Intelligence and Operations Research for

218

Branch & Bound (Randomized)

Standard OR approach for solving Mixed Integer Programs (MIPs)

• Solve linear relaxation of MIP

• Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value)

• Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values

Page 219: Integration of Artificial Intelligence and Operations Research for

219

Branch & Bound: Depth First vs. Best Bound

Critical to the performance of Branch & Bound: the way in which the next node to be expanded is selected.

Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*; the LP relaxation provides an admissible search heuristic.

Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value).
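The two node-selection rules can be compared on a small example. The following is a minimal sketch, not the solver used in the tutorial: it assumes a 0/1 knapsack problem (chosen because its LP relaxation can be solved greedily without an LP library), omits randomization, and uses hypothetical names (`lp_bound`, `branch_and_bound`).

```python
import heapq

def lp_bound(values, weights, capacity, fixed):
    """LP-relaxation bound for a 0/1 knapsack node.

    `fixed` maps item index -> 0 or 1 (branching decisions so far).
    Returns (bound, fractional_item); fractional_item is None when the
    relaxed solution is integral. Returns (None, None) if infeasible.
    """
    value = sum(values[i] for i, x in fixed.items() if x == 1)
    room = capacity - sum(weights[i] for i, x in fixed.items() if x == 1)
    if room < 0:
        return None, None
    free = sorted((i for i in range(len(values)) if i not in fixed),
                  key=lambda i: values[i] / weights[i], reverse=True)
    for i in free:
        if weights[i] <= room:          # item fits entirely
            room -= weights[i]
            value += values[i]
        elif room > 0:                  # item fits only fractionally
            return value + values[i] * room / weights[i], i
    return value, None

def branch_and_bound(values, weights, capacity, strategy="best-bound"):
    """Maximize knapsack value; returns (best_value, nodes_expanded)."""
    frontier, counter, best, nodes = [], 0, 0, 0

    def push(fixed):
        nonlocal counter
        bound, frac = lp_bound(values, weights, capacity, fixed)
        if bound is None:
            return                      # infeasible node, discard
        counter += 1                    # unique tie-breaker for the heap
        entry = (-bound, counter, frac, fixed)
        if strategy == "best-bound":
            heapq.heappush(frontier, entry)
        else:
            frontier.append(entry)      # used as a stack for depth-first

    push({})
    while frontier:
        if strategy == "best-bound":
            neg_bound, _, frac, fixed = heapq.heappop(frontier)
        else:
            neg_bound, _, frac, fixed = frontier.pop()
        nodes += 1
        if -neg_bound <= best:
            continue                    # prune: relaxation cannot beat incumbent
        if frac is None:
            best = -neg_bound           # integral solution becomes the incumbent
            continue
        for x in (0, 1):                # "floor" and "ceiling" branches
            push({**fixed, frac: x})
    return best, nodes

values, weights, capacity = [60, 100, 120], [10, 20, 30], 50
for strategy in ("best-bound", "depth-first"):
    print(strategy, branch_and_bound(values, weights, capacity, strategy))
```

Both strategies reach the same optimum; they differ only in the order in which nodes are expanded, and therefore in how many nodes are explored before the incumbent prunes the rest.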

Page 220: Integration of Artificial Intelligence and Operations Research for

220

Portfolio of Algorithms

A portfolio of algorithms is a collection of algorithms and/or copies of the same algorithm running interleaved or on different processors.

Goal: to improve on the performance of the component algorithms in terms of:

• expected computational cost
• “risk” (variance)

Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk.
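One way to read the efficient set is as the set of non-dominated portfolios: a portfolio is dropped if some other portfolio is at least as good on both expected cost and risk and strictly better on one. The sketch below assumes this reading; the function name `efficient_set` and the numbers are hypothetical illustrations, not measurements from the tutorial.

```python
def efficient_set(portfolios):
    """Return the names of non-dominated portfolios.

    `portfolios` maps a name to (expected_cost, std_dev); lower is better
    for both. A portfolio is dominated if another one is no worse on both
    criteria and strictly better on at least one.
    """
    efficient = []
    for name, (mean, std) in portfolios.items():
        dominated = any(
            m <= mean and s <= std and (m < mean or s < std)
            for other, (m, s) in portfolios.items()
            if other != name
        )
        if not dominated:
            efficient.append(name)
    return efficient

# Hypothetical (expected cost, standard deviation) pairs for illustration only.
example = {
    "0 DF / 2 BB": (100.0, 50.0),
    "1 DF / 1 BB": (80.0, 60.0),
    "2 DF / 0 BB": (120.0, 90.0),
}
print(efficient_set(example))
```

In this made-up example the last portfolio is dominated, while the first two trade expected cost against risk and so both remain in the efficient set.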

Page 221: Integration of Artificial Intelligence and Operations Research for

221

Depth-first vs. Best-bound (logistics planning)

[Figure: cumulative frequencies vs. number of nodes. Curve labels: Depth-First ~50%; Best-Bound ~30%.]

Page 222: Integration of Artificial Intelligence and Operations Research for

222

Depth-First and Best-Bound do not dominate each other overall.

Page 223: Integration of Artificial Intelligence and Operations Research for

223

Heavy-tailed behavior of Depth-first

Page 224: Integration of Artificial Intelligence and Operations Research for

224

Portfolio for heavy-tailed search procedures (2 processors)

[Figure: expected run time of portfolios vs. standard deviation of run time of portfolios. Labeled points: 0 DF / 2 BB; 2 DF / 0 BB.]

Page 225: Integration of Artificial Intelligence and Operations Research for

225

Portfolio for heavy-tailed search procedures (6 processors)

[Figure: expected run time of portfolios vs. standard deviation of run time of portfolios. Labeled points: 0 DF / 6 BB; 3 DF / 3 BB; 4 DF / 2 BB; 5 DF / 1 BB; 6 DF / 0 BB. Annotation: Efficient set.]

Page 226: Integration of Artificial Intelligence and Operations Research for

226

Portfolio for heavy-tailed search procedures (20 processors)

[Figure: expected run time of portfolios vs. standard deviation of run time of portfolios. Labeled points: 0 DF / 20 BB; 20 DF / 0 BB.]

Page 227: Integration of Artificial Intelligence and Operations Research for

227

Portfolio for heavy-tailed search procedures (2-20 processors)

Page 228: Integration of Artificial Intelligence and Operations Research for

228

A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena.

Page 229: Integration of Artificial Intelligence and Operations Research for

229

Summary of Randomization

Considered randomized backtrack search.

Showed Heavy-Tailed Distributions.

Suggests: Rapid Restart Strategy.
--- cuts very long runs
--- exploits ultra-short runs

Experimentally validated on previously unsolved planning and scheduling problems.

Portfolio of Algorithms for cases where no single heuristic dominates.
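The effect of a rapid restart strategy can be illustrated with a small calculation. This is a sketch under stated assumptions: the run time of a single randomized run follows a known discrete distribution, runs are independent, and a run is abandoned and restarted after a fixed cutoff. The function name and the toy distribution are hypothetical and are not data from the tutorial.

```python
def expected_cost_with_restarts(pmf, cutoff):
    """Expected total cost when every run is cut off after `cutoff` steps.

    `pmf` maps the run time of a single run to its probability.
    With q = P[T <= cutoff], the expected cost E satisfies
        E = E[T; T <= cutoff] + (1 - q) * (cutoff + E),
    which gives E = (E[T; T <= cutoff] + cutoff * (1 - q)) / q.
    """
    q = sum(p for t, p in pmf.items() if t <= cutoff)
    if q == 0:
        return float("inf")             # no run ever finishes within the cutoff
    partial = sum(t * p for t, p in pmf.items() if t <= cutoff)
    return (partial + cutoff * (1 - q)) / q

# Toy distribution: half of the runs are ultra-short, half are very long.
toy_pmf = {10: 0.5, 100_000: 0.5}

no_restart = expected_cost_with_restarts(toy_pmf, cutoff=100_000)
with_restart = expected_cost_with_restarts(toy_pmf, cutoff=10)
print(no_restart, with_restart)
```

With this toy distribution, the cutoff removes the very long runs from the expected cost while still benefiting from the ultra-short ones, which is the intuition behind the restart strategy summarized above.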

Page 230: Integration of Artificial Intelligence and Operations Research for

230

Summary of Randomization

Considered randomized backtrack search.

Showed Heavy-Tailed Distributions.

Suggests: Rapid Restart Strategy.
--- exploits ultra-short runs
--- cuts very long runs

Experimentally validated on previously unsolved planning and scheduling problems.

Portfolio of Algorithms for cases where no single heuristic dominates.

Page 231: Integration of Artificial Intelligence and Operations Research for

231

IV. CONCLUSIONS

Page 232: Integration of Artificial Intelligence and Operations Research for

232

Important Themes in OR

Linear Programming

(Mixed) Integer Programming

Exploit Structure, e.g., Network Flow Problems

Duality: very elegant theory in LP; sensitivity analysis

Page 233: Integration of Artificial Intelligence and Operations Research for

233

Opportunities for Integration of AI/OR

OR methods:
• Have focused on tractable representations (LP)
• Have demonstrated the ability to identify optimal and locally optimal solutions
• LIMITATION: Restricted to rigid models with limited expressive power

AI methods:
• Richer and more flexible representations, supporting constraint-based reasoning mechanisms as well as mixed-initiative frameworks that keep human expertise in the loop
• LIMITATION: Rich representations in general lead to intractable problems

CHALLENGE: good representations / fast & good solutions

Page 234: Integration of Artificial Intelligence and Operations Research for

234

Opportunities for Integration of AI/OR

AI methods are becoming competitive

AI methods used to be considered unsuitable for real-world scheduling problems. Recent developments have shown they can be competitive. Examples:

• SAP, Peoplesoft, I2, … -> provide scheduling solutions that combine constraint programming and mathematical programming approaches.
• ILOG (CP language) has several fielded applications in different scheduling areas; ILOG has integrated a CSP solver with CPLEX.
• OR people have acknowledged the benefits of combining OR and AI methods.

Page 235: Integration of Artificial Intelligence and Operations Research for

235

Opportunities for Integration of AI/OR

Exploiting Duality in CSP frameworks

Exploiting Randomization

Hybrid Solvers

Page 236: Integration of Artificial Intelligence and Operations Research for

236

Opportunities for Integration of AI/OR

Hybrid Solvers - emerging area of research (CSP+OR); it started with CLP(R), Prolog III, and CHIP; ILOG integrates a CSP solver with CPLEX.

• local constraint propagation - local consistency algorithms
• global constraint propagation - LP relaxations

Only a hybrid approach could prove optimality, e.g.:
• Hoist scheduling (Rodosek & Wallace 1998)
• Multicommodity integer network flow problem (Dutch Railways) (McAloon, Tretkoff, Wetzel 1998)

Page 237: Integration of Artificial Intelligence and Operations Research for

237

Updated version of tutorial slides

www.cs.cornell.edu/gomes/

Talks / Demos

Page 238: Integration of Artificial Intelligence and Operations Research for

238

Appendix

Page 239: Integration of Artificial Intelligence and Operations Research for

239

Portfolio of Algorithms

A portfolio of algorithms is a collection of algorithms and/or copies of the same algorithm running interleaved or on different processors.

A portfolio has an expected computational cost and a standard deviation, a measure of the dispersion of the computational cost.

The standard deviation of the portfolio is a measure of the risk inherent to the portfolio.

Page 240: Integration of Artificial Intelligence and Operations Research for

240

Portfolio of Algorithms

Goal: to improve on the performance of the component algorithms in terms of:

• expected computational cost
• “risk” (variance)

Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk.

Page 241: Integration of Artificial Intelligence and Operations Research for

241

Appendix Portfolio of Algorithms

Goal: to improve on the performance of the component algorithms in terms of:

• expected computational cost;
• risk;

Efficient Set or Efficient Frontier - set of portfolios that are the best in terms of expected value and risk.

Within the efficient set, reducing the risk requires worsening the expected value, and improving the expected value requires increasing the risk.

Page 242: Integration of Artificial Intelligence and Operations Research for

242

Appendix Portfolio of Two Algorithms

Let us consider the random variables:

A1 - the number of backtracks that algorithm 1 takes to find a solution or prove that a solution doesn’t exist;

A2 - the number of backtracks that algorithm 2 takes to find a solution or prove that a solution doesn’t exist;

Page 243: Integration of Artificial Intelligence and Operations Research for

243

Appendix Portfolio of Two Algorithms

Let us consider that we have N processors and we design a portfolio using n1 processors with algorithm 1 and n2 processors with algorithm 2 (N = n1 + n2).

Let us consider the random variable:

X - the number of backtracks that the portfolio takes to find a solution or prove that a solution doesn’t exist;

Page 244: Integration of Artificial Intelligence and Operations Research for

244

Appendix Portfolio of Two Algorithms

Given N processors, and n1 = N, n2 = 0:

$$P[X = x] = \sum_{i=1}^{N} \binom{N}{i} \, P[A_1 = x]^{i} \, P[A_1 > x]^{(N-i)}$$

Page 245: Integration of Artificial Intelligence and Operations Research for

245

Appendix Portfolio of Algorithms

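Given N processors, n1 such that 0 ≤ n1 ≤ N, and n2 = N − n1,

$$P[X = x] = \sum_{i=1}^{N} \sum_{i'=0}^{n_1} \binom{n_1}{i'} \, P[A_1 = x]^{i'} \, P[A_1 > x]^{(n_1 - i')} \binom{n_2}{i''} \, P[A_2 = x]^{i''} \, P[A_2 > x]^{(n_2 - i'')}$$

where i'' = i − i', and the term in the summation is 0 when i'' < 0 or i'' > n2.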
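The distribution of X can be evaluated directly from the distributions of A1 and A2. The sketch below assumes that the portfolio stops as soon as any processor finishes (so X is the minimum over all copies), that the copies are independent, and that A1 and A2 have known discrete distributions; the function names and the toy distributions are hypothetical.

```python
from math import comb

def survival(pmf, x):
    """P[A > x] for a discrete distribution given as {value: probability}."""
    return sum(p for t, p in pmf.items() if t > x)

def portfolio_pmf(pmf1, pmf2, n1, n2):
    """P[X = x] for a portfolio of n1 copies of algorithm 1 and n2 of algorithm 2.

    For each x, sum over i (processors finishing exactly at x), split into
    i1 copies of algorithm 1 and i2 = i - i1 copies of algorithm 2; all
    remaining copies must take strictly longer than x.
    """
    total_procs = n1 + n2
    result = {}
    for x in sorted(set(pmf1) | set(pmf2)):
        eq1, gt1 = pmf1.get(x, 0.0), survival(pmf1, x)
        eq2, gt2 = pmf2.get(x, 0.0), survival(pmf2, x)
        prob = 0.0
        for i in range(1, total_procs + 1):
            for i1 in range(0, n1 + 1):
                i2 = i - i1
                if i2 < 0 or i2 > n2:
                    continue            # term is zero outside this range
                prob += (comb(n1, i1) * eq1 ** i1 * gt1 ** (n1 - i1)
                         * comb(n2, i2) * eq2 ** i2 * gt2 ** (n2 - i2))
        result[x] = prob
    return result

def mean_and_std(pmf):
    """Expected value and standard deviation of a discrete distribution."""
    mean = sum(x * p for x, p in pmf.items())
    var = sum((x - mean) ** 2 * p for x, p in pmf.items())
    return mean, var ** 0.5

# Toy distributions over the number of backtracks (illustrative only).
toy_a1 = {1: 0.5, 3: 0.5}
toy_a2 = {2: 1.0}
mixed = portfolio_pmf(toy_a1, toy_a2, n1=1, n2=1)
print(mixed, mean_and_std(mixed))
```

Evaluating `mean_and_std` for every split (n1, n2) of N processors gives the points from which an expected-value versus standard-deviation plot of the portfolios can be drawn.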

Page 246: Integration of Artificial Intelligence and Operations Research for

246

Preliminary Research on Structure of Search Spaces

Page 247: Integration of Artificial Intelligence and Operations Research for

247

Fringe of Search Tree

Page 248: Integration of Artificial Intelligence and Operations Research for

248

Fractal Dimension

Page 249: Integration of Artificial Intelligence and Operations Research for

249

Fractal Dimension

When plotting the length of a curve as a function of the measuring tool on a log-log plot, one obtains a linear relationship:

$$L = c \, (1/s)^{d}$$

L - the measured length; s - the length of the yardstick; c and d are constants.

Mandelbrot introduced the fractal dimension D = d + 1. A straight line has D = 1.0; the coast of Britain has fractal dimension 1.22.

The higher D, the more fractal the curve is.
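Because the relationship between L and 1/s is linear on a log-log plot, d can be estimated as the slope of a least-squares line through the points (log(1/s), log L). The sketch below assumes the power-law form L = c (1/s)^d; the function name and the measurements are hypothetical, and the data are generated synthetically so that D = 1.22.

```python
import math

def fractal_dimension(yardsticks, lengths):
    """Estimate D = d + 1 from yardstick lengths s and measured lengths L.

    Fits log L = log c + d * log(1/s) by ordinary least squares and
    returns the slope d plus one.
    """
    xs = [math.log(1.0 / s) for s in yardsticks]
    ys = [math.log(length) for length in lengths]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope + 1.0

# Synthetic measurements following L = c * (1/s)^d with c = 10 and d = 0.22.
sticks = [1.0, 0.5, 0.25, 0.125]
measured = [10.0 * (1.0 / s) ** 0.22 for s in sticks]
print(fractal_dimension(sticks, measured))
```

For a straight line the measured length does not depend on the yardstick, so the slope is 0 and the estimate is D = 1.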

Page 250: Integration of Artificial Intelligence and Operations Research for

250

Heavy-Tailed Behavior vs Non-heavy-tailed behavior