decision camp 2014 - charles forgy - affecting rules performance

Factors Affecting Rule Performance Charles L. Forgy October, 2014


Posted on 16-Jun-2015


Page 1: Decision CAMP 2014 - Charles Forgy - Affecting rules performance

Factors Affecting Rule Performance

Charles L. Forgy

October, 2014

Page 2:

Outline

• Rete: A 10,000 foot view.

• Dealing With Expensive Rules.

• Other Considerations.

• Conclusion.

Page 3:

RETE: A 10,000 FOOT VIEW.

Page 4:

Reactive Rules

We are only going to consider reactive rules.

• Inference rules.

• Monitoring rules.

Page 5:

Rete

Most rule engines use some form of Rete to handle reactive rules.

• Advisor, CLIPS, Drools, ILOG, Jess, OPSJ, SMARTS, TIBCO ...

Page 6:

OPSJ

rule marking
if {
    st: Stage(st.value == "marking");
    j1: Junction(j1.visited == "now");
    e1: Edge(e1.p2 == j1.base_point);
    j2: Junction(j2.base_point == e1.p1,
                 j2.visited == "yes");
} do {
    j2.visited = "check";
    update(j2);
}

Page 7:

The Recognize-Act Cycle

1. Match: The engine determines which rules have satisfied conditions.

2. Conflict Resolution: One or more rules are selected for execution.

3. Action: The Then parts of the selected rules are executed.

The Recognize-Act cycle repeats until there are no satisfied rules.
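
The three steps can be sketched as a loop in plain Java. This is a minimal, naive illustration and not how OPSJ or any particular engine is implemented: the `Rule` class, `RecognizeActDemo`, and the "take the first satisfied rule" conflict-resolution strategy are all assumptions made for the example, and matching is recomputed from scratch on every cycle.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical rule: a condition over working memory plus an action that changes it.
final class Rule {
    final String name;
    final Predicate<List<Integer>> condition;
    final Consumer<List<Integer>> action;

    Rule(String name, Predicate<List<Integer>> condition, Consumer<List<Integer>> action) {
        this.name = name;
        this.condition = condition;
        this.action = action;
    }
}

final class RecognizeActDemo {
    // Runs the cycle until no rule is satisfied; returns the number of cycles executed.
    static int run(List<Rule> rules, List<Integer> workingMemory) {
        int cycles = 0;
        while (true) {
            // 1. Match: collect every rule whose condition holds.
            List<Rule> satisfied = new ArrayList<>();
            for (Rule r : rules) {
                if (r.condition.test(workingMemory)) satisfied.add(r);
            }
            if (satisfied.isEmpty()) return cycles;
            // 2. Conflict resolution (assumed strategy): take the first satisfied rule.
            Rule chosen = satisfied.get(0);
            // 3. Action: execute the chosen rule, which changes working memory.
            chosen.action.accept(workingMemory);
            cycles++;
        }
    }

    public static void main(String[] args) {
        List<Integer> wm = new ArrayList<>(List.of(3, 1, 2));
        Rule dropLargest = new Rule("drop_largest",
            m -> m.size() > 1,
            m -> m.remove(Collections.max(m)));
        // prints: 2 cycles, left: [1]
        System.out.println(run(List.of(dropLargest), wm) + " cycles, left: " + wm);
    }
}
```

The Match step here rescans all of working memory every cycle, which is the cost a state-saving algorithm is designed to avoid.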

Page 8:

The Problem for the Engine

• The working memory may contain thousands of objects.

• Systems may contain hundreds to thousands of rules.

• A condition part may contain several conditions, each of which has to match an object.

Page 9:

Saving Information

• Typically, only a few objects are added to or removed from Working Memory on each cycle.

• So, most of the information that was computed for any cycle N is still useful on cycle N+1.

Page 10:

The Task

Given the state of the system on cycle N, to determine the state for cycle N+1:

• Delete the specific pieces of information that are no longer correct.

• Add in the new pieces of information.

Page 11:

More Specifically

Given the state of the system on cycle N, to determine the state for cycle N+1:

• For each object in working memory that was deleted or changed, remove every piece of match information referring to that object.

• For each object in working memory that was changed or added, add any match information that can make use of that object.
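
The two bullets can be sketched for a single condition as follows. This is an illustrative sketch under stated assumptions: `ConditionMemory` is a hypothetical class, working-memory objects are modelled as strings such as "j1:no", and a changed object is treated as removing its old version and adding its new one.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical saved match state for one condition; objects are modelled as strings.
final class ConditionMemory {
    private final Predicate<String> test;
    private final Set<String> matches = new LinkedHashSet<>();
    private int testsEvaluated = 0;

    ConditionMemory(Predicate<String> test) {
        this.test = test;
    }

    // Moves the saved state from cycle N to cycle N+1 using only the changes.
    void applyChanges(List<String> deletedOrOldVersions, List<String> addedOrNewVersions) {
        for (String obj : deletedOrOldVersions) {
            matches.remove(obj);          // drop information that is no longer correct
        }
        for (String obj : addedOrNewVersions) {
            testsEvaluated++;             // only changed or added objects are tested
            if (test.test(obj)) matches.add(obj);
        }
    }

    Set<String> matches() { return matches; }

    int testsEvaluated() { return testsEvaluated; }

    public static void main(String[] args) {
        ConditionMemory unvisited = new ConditionMemory(s -> s.endsWith(":no"));
        unvisited.applyChanges(List.of(), List.of("j1:no", "j2:no", "j3:yes"));
        // Next cycle: j1 changes from "no" to "now"; j2 and j3 are untouched.
        unvisited.applyChanges(List.of("j1:no"), List.of("j1:now"));
        // prints: [j2:no] after 4 tests
        System.out.println(unvisited.matches() + " after " + unvisited.testsEvaluated() + " tests");
    }
}
```

The second cycle evaluates one test instead of three, because the untouched objects keep their saved results.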

Page 12:

Constraints Within Conditions

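The constraints within conditions compare a field of one object with a constant: st.value == "marking", j1.visited == "now", and j2.visited == "yes".

rule marking
if {
    st: Stage(st.value == "marking");
    j1: Junction(j1.visited == "now");
    e1: Edge(e1.p2 == j1.base_point);
    j2: Junction(j2.base_point == e1.p1,
                 j2.visited == "yes");
} do {
    j2.visited = "check";
    update(j2);
}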

Page 13:

Constraints Between Conditions

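The constraints between conditions compare fields of different objects: e1.p2 == j1.base_point and j2.base_point == e1.p1.

rule marking
if {
    st: Stage(st.value == "marking");
    j1: Junction(j1.visited == "now");
    e1: Edge(e1.p2 == j1.base_point);
    j2: Junction(j2.base_point == e1.p1,
                 j2.visited == "yes");
} do {
    j2.visited = "check";
    update(j2);
}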

Page 14:

Handling Inserted Objects

To find the rules that might be affected by a changed working memory object, Rete:

1. Uses the constraints within each condition to find the conditions that match at least that information. (“Alpha” tests.)

2. Uses the constraints between conditions to determine whether the changed object fully matches. (“Beta” tests.)
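
The two kinds of test can be sketched in Java for the first two joined conditions of the marking rule (j1 and e1). This is a simplified illustration, not engine code: the `Junction` and `Edge` classes, the use of plain ints for points, and the method names are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class Junction {
    final int basePoint;
    final String visited;

    Junction(int basePoint, String visited) {
        this.basePoint = basePoint;
        this.visited = visited;
    }
}

final class Edge {
    final int p1;
    final int p2;

    Edge(int p1, int p2) {
        this.p1 = p1;
        this.p2 = p2;
    }
}

final class AlphaBetaDemo {
    // Alpha test: a constraint within one condition (a field compared with a constant).
    static List<Junction> visitedNow(List<Junction> junctions) {
        List<Junction> out = new ArrayList<>();
        for (Junction j : junctions) {
            if (j.visited.equals("now")) out.add(j);
        }
        return out;
    }

    // Beta test: a constraint between conditions (e1.p2 == j1.base_point).
    // Each result is a {junction base point, edge p1, edge p2} triple.
    static List<int[]> joinEdges(List<Junction> alphaJunctions, List<Edge> edges) {
        List<int[]> out = new ArrayList<>();
        for (Junction j1 : alphaJunctions) {
            for (Edge e1 : edges) {
                if (e1.p2 == j1.basePoint) out.add(new int[] {j1.basePoint, e1.p1, e1.p2});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Junction> junctions = List.of(
            new Junction(10, "now"), new Junction(20, "yes"), new Junction(30, "now"));
        List<Edge> edges = List.of(new Edge(20, 10), new Edge(40, 30), new Edge(10, 20));
        List<Junction> alpha = visitedNow(junctions);
        // prints: 2 alpha matches, 2 beta matches
        System.out.println(alpha.size() + " alpha matches, "
            + joinEdges(alpha, edges).size() + " beta matches");
    }
}
```

The edge ending at point 20 is never joined, because the junction at 20 already failed the alpha test.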

Page 15:

Saving Information

Rete saves all the information that it computes while processing each rule.

• It keeps track of the objects that match each condition based on only Alpha tests.

• For each initial sequence of conditions in a condition part, it keeps track of the lists of objects that match those conditions when Beta tests are also applied.

Page 16:

Order Rule

rule order
if {
    low: Integer;
    mid: Integer(mid.intValue() > low.intValue());
    hi: Integer(hi.intValue() > mid.intValue());
} do {
    delete(mid);
}

Page 17:

Execute:
    insert(new Integer(0));
    insert(new Integer(1));

Alpha matches:
    low: Integer(0), Integer(1)
    mid: Integer(0), Integer(1)
    hi:  Integer(0), Integer(1)

Page 18:

Alpha Matches

rule order
if {
    low: Integer;
            {Integer(0), Integer(1)}
    mid: Integer(mid.intValue() > low.intValue());
            {Integer(0), Integer(1)}
    hi: Integer(hi.intValue() > mid.intValue());
            {Integer(0), Integer(1)}
} do {
    delete(mid);
}

Page 19:

Execute:
    insert(new Integer(0));
    insert(new Integer(1));

Alpha matches:
    low: Integer(0), Integer(1)
    mid: Integer(0), Integer(1)
    hi:  Integer(0), Integer(1)

Beta matches:
    [low, mid]: [Integer(0), Integer(1)]
    [low, mid, hi]: (none)

Page 20:

Beta Matches

rule order
if {
    low: Integer;
    mid: Integer(mid.intValue() > low.intValue());
            {[Integer(0), Integer(1)]}
    hi: Integer(hi.intValue() > mid.intValue());
            { }
} do {
    delete(mid);
}

Page 21:

Execute:
    insert(new Integer(2));

Alpha matches:
    low: Integer(0), Integer(1), Integer(2)
    mid: Integer(0), Integer(1), Integer(2)
    hi:  Integer(0), Integer(1), Integer(2)

Beta matches:
    [low, mid]: [Integer(0), Integer(1)],
                [Integer(0), Integer(2)],
                [Integer(1), Integer(2)]
    [low, mid, hi]: [Integer(0), Integer(1), Integer(2)]

Page 22:

Beta Matches

rule order
if {
    low: Integer;
    mid: Integer(mid.intValue() > low.intValue());
            {[Integer(0), Integer(1)],
             [Integer(0), Integer(2)],
             [Integer(1), Integer(2)]}
    hi: Integer(hi.intValue() > mid.intValue());
            {[Integer(0), Integer(1), Integer(2)]}
} do {
    delete(mid);
}
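
The alpha and beta memories for the order rule can be reproduced with a small Java sketch. It is a hand-written illustration for this one rule, not a general Rete network: `OrderRuleMemories` and its field names are hypothetical, and plain ints stand in for the Integer objects in working memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical memories for the three-condition "order" rule.
final class OrderRuleMemories {
    final List<Integer> alpha = new ArrayList<>();  // same alpha matches for low, mid and hi
    final List<int[]> lowMid = new ArrayList<>();   // beta matches for [low, mid]
    final List<int[]> lowMidHi = new ArrayList<>(); // beta matches for [low, mid, hi]

    // Incremental update: only combinations involving the new object v are computed.
    void insert(int v) {
        List<int[]> newPairs = new ArrayList<>();
        for (int other : alpha) {
            if (other > v) newPairs.add(new int[] {v, other}); // v plays "low"
            if (v > other) newPairs.add(new int[] {other, v}); // v plays "mid"
        }
        // v plays "hi" on top of pairs saved in earlier cycles.
        for (int[] p : lowMid) {
            if (v > p[1]) lowMidHi.add(new int[] {p[0], p[1], v});
        }
        alpha.add(v);
        // Each new pair is extended with every alpha match that can play "hi".
        for (int[] p : newPairs) {
            for (int hi : alpha) {
                if (hi > p[1]) lowMidHi.add(new int[] {p[0], p[1], hi});
            }
        }
        lowMid.addAll(newPairs);
    }

    static String show(List<int[]> tuples) {
        return Arrays.deepToString(tuples.toArray(new int[0][]));
    }

    public static void main(String[] args) {
        OrderRuleMemories m = new OrderRuleMemories();
        m.insert(0);
        m.insert(1);
        // prints: [[0, 1]] []
        System.out.println(show(m.lowMid) + " " + show(m.lowMidHi));
        m.insert(2);
        // prints: [[0, 1], [0, 2], [1, 2]] [[0, 1, 2]]
        System.out.println(show(m.lowMid) + " " + show(m.lowMidHi));
    }
}
```

Inserting 2 reuses the saved pair [0, 1] to build the triple [0, 1, 2] rather than rediscovering it.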

Page 23:

Handling Deleted Objects

Processing deleted objects is fast. The engine records, for each object, the saved match data in which that object participates, so when the object is deleted the engine can remove all of that stored match information directly.
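
One way to picture this bookkeeping is a reverse index from each object to the stored matches that mention it. The sketch below is an assumption about how such an index could look, not a description of any engine's internals: `MatchStore` is hypothetical, and Integer values stand in for working-memory objects.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical store of saved matches with a reverse index per object.
final class MatchStore {
    private final Set<List<Integer>> tuples = new LinkedHashSet<>();
    private final Map<Integer, List<List<Integer>>> byObject = new HashMap<>();

    void add(List<Integer> tuple) {
        if (tuples.add(tuple)) {
            for (Integer obj : tuple) {
                byObject.computeIfAbsent(obj, k -> new ArrayList<>()).add(tuple);
            }
        }
    }

    // Removes every saved match that mentions obj without re-running any test.
    // Returns the number of matches removed.
    int delete(Integer obj) {
        List<List<Integer>> affected = byObject.remove(obj);
        if (affected == null) return 0;
        int removed = 0;
        for (List<Integer> tuple : affected) {
            if (tuples.remove(tuple)) removed++;
            for (Integer other : tuple) {
                if (!other.equals(obj)) {
                    List<List<Integer>> list = byObject.get(other);
                    if (list != null) list.remove(tuple); // keep the other indexes consistent
                }
            }
        }
        return removed;
    }

    Set<List<Integer>> tuples() { return tuples; }

    public static void main(String[] args) {
        MatchStore store = new MatchStore();
        store.add(List.of(0, 1));
        store.add(List.of(0, 2));
        store.add(List.of(1, 2));
        store.add(List.of(0, 1, 2));
        int removed = store.delete(1);
        // prints: removed 3, left [[0, 2]]
        System.out.println("removed " + removed + ", left " + store.tuples());
    }
}
```

Deleting 1 leaves only the pair [0, 2], which matches the state of the order rule after it fires and deletes Integer(1).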

Page 24:

Order Rule

rule order
if {
    low: Integer;
    mid: Integer(mid.intValue() > low.intValue());
    hi: Integer(hi.intValue() > mid.intValue());
} do {
    delete(mid);
}

Page 25:

After the rule fires (deleting Integer(1)):

Alpha matches:
    low: Integer(0), Integer(2)
    mid: Integer(0), Integer(2)
    hi:  Integer(0), Integer(2)

Beta matches:
    [low, mid]: [Integer(0), Integer(2)]
    [low, mid, hi]: (none)

Page 26:

Important Points

• Rete saves state, processing only the changed objects each cycle.

• For each initial sequence of conditions in a condition part, Rete keeps track of the lists of objects that match those conditions.

Page 27:

DEALING WITH EXPENSIVE RULES

Page 28:

Finding Expensive Rules

The expensive rules are not always the ones the developer suspects; profiling is essential.

Profiling tools differ widely between rule engines.

– Check the documentation for the engine you are using.

Page 29:

OPSJ – Java Profiler

Page 30:

The Slowest Rule

rule start_visit_3_junction
if {
    stg: Stage(stg.value == "labeling");
    junct: Junction(junct.kind == "3j",
                    junct.visited == "no");
} do {
    junct.visited = "now";
    stg.value = "visiting_3j";
    update(junct, stg);
}

Page 31:

Excessive Beta Matches

In many cases, expensive rules are caused by creating excessive beta matches.

Typical issues:

– Computing more information than is needed.

– Throwing away information and recomputing it.

Page 32:

How Many Junctions Are There?

rule start_visit_3_junction
if {
    stg: Stage(stg.value == "labeling");
    junct: Junction(junct.kind == "3j",
                    junct.visited == "no");
} do {
    junct.visited = "now";
    stg.value = "visiting_3j";
    update(junct, stg);
}

Page 33:

Updating the Stage Causes Beta Matches to be Discarded

rule start_visit_3_junction
if {
    stg: Stage(stg.value == "labeling");
    junct: Junction(junct.kind == "3j",
                    junct.visited == "no");
} do {
    junct.visited = "now";
    stg.value = "visiting_3j";
    update(junct, stg);
}

Page 34:

Option 1

rule start_visit_3_junction
if {
    stg: Stage(stg.value == "labeling");
    junct: Junction(junct.kind == "3j",
                    junct.visited == "no");
} do {
    junct.visited = "now";
    update(junct);
    insert(new Stage("visiting_3j"));
}

Page 35:

Option 1: Continued

To accommodate this change, other rules must be modified to use insert/delete of stages as well.

Page 36:

Option 2

rule start_visit_3_junction
if {
    stg: Stage(stg.value == "labeling");
    junct: from j:Junction(j.kind == "3j",
                           j.visited == "no")
           TakeAny;
} do {
    junct.visited = "now";
    stg.value = "visiting_3j";
    update(junct, stg);
}

Page 37:

Effects of Changes

• Option 1 computes a lot of state information, but does not throw it away while it is still useful.

• Option 2 computes only the state that is needed at each step.
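
A rough count shows why the original rule is costly. The sketch below rests on assumptions that the slides do not state: there are N unvisited "3j" junctions, the program returns to the "labeling" stage once after each junction is visited, and every return pairs the Stage with each junction that is still unvisited. `BetaMatchCount` and its methods are hypothetical names.

```java
final class BetaMatchCount {
    // Original rule: every return to "labeling" pairs the Stage with each
    // junction that is still unvisited, and the next update discards those pairs.
    static long originalRule(int junctions) {
        long created = 0;
        for (int unvisited = junctions; unvisited > 0; unvisited--) {
            created += unvisited; // rebuilt after the previous update discarded them
        }
        return created;
    }

    // Option 2 (TakeAny): the Stage is paired with a single junction per firing.
    static long takeAny(int junctions) {
        return junctions;
    }

    public static void main(String[] args) {
        int n = 1000;
        // prints: original: 500500, TakeAny: 1000
        System.out.println("original: " + originalRule(n) + ", TakeAny: " + takeAny(n));
    }
}
```

Under these assumptions the original rule creates N(N+1)/2 beta matches over the whole pass, while TakeAny creates N.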

Page 38:

OPSJ – Java Profiler

Page 39:

Initial Boundary Rules

rule initial_boundary_junction_L
if {
    stg: Stage(stg.value == "find_initial_boundary");
    junct: Junction(junct.kind == "2j");
    edge1: Edge(edge1.p1 == junct.base_point,
                edge1.p2 == junct.p1);
    edge2: Edge(edge2.p1 == junct.base_point,
                edge2.p2 == junct.p2);
    !j2: Junction(j2.base_point > junct.base_point);
} do { . . . }

Page 40:

Expensive Idioms

Another common problem is using expensive idioms in rule conditions.

Page 41:

Maximize Idiom

rule initial_boundary_junction_L
if {
    stg: Stage(stg.value == "find_initial_boundary");
    junct: Junction(junct.kind == "2j");
    edge1: Edge(edge1.p1 == junct.base_point,
                edge1.p2 == junct.p1);
    edge2: Edge(edge2.p1 == junct.base_point,
                edge2.p2 == junct.p2);
    !j2: Junction(j2.base_point > junct.base_point);
} do { . . . }

Page 42:

Why This Is Expensive

• When comparing a collection of beta matches with a collection of alpha matches, not all tests are equally expensive.

• Equality tests can be handled quite efficiently.

• Order tests (<, >, etc.) take more time to evaluate.
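
One plausible reason for the difference, assuming the engine indexes equality joins with a hash table (a common technique, but not something these slides state), is that an equality test can jump straight to the matching bucket, while an order test has to look at every candidate. The `JoinCost` class below is a hypothetical sketch that counts how many alpha objects each kind of join examines.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class JoinCost {
    // Equality join: hash the alpha side once, then probe one bucket per beta match.
    // Returns {matches found, alpha objects examined}.
    static long[] equalityJoin(List<Integer> betaKeys, List<Integer> alphaKeys) {
        Map<Integer, List<Integer>> index = new HashMap<>();
        for (Integer a : alphaKeys) {
            index.computeIfAbsent(a, k -> new ArrayList<>()).add(a);
        }
        long matches = 0;
        long examined = 0;
        for (Integer b : betaKeys) {
            List<Integer> bucket = index.get(b);
            if (bucket != null) {
                examined += bucket.size();
                matches += bucket.size();
            }
        }
        return new long[] {matches, examined};
    }

    // Order join (alpha > beta): with no bucket to narrow the search,
    // every alpha object is examined for every beta match.
    static long[] orderJoin(List<Integer> betaKeys, List<Integer> alphaKeys) {
        long matches = 0;
        long examined = 0;
        for (int b : betaKeys) {
            for (int a : alphaKeys) {
                examined++;
                if (a > b) matches++;
            }
        }
        return new long[] {matches, examined};
    }

    public static void main(String[] args) {
        List<Integer> keys = new ArrayList<>();
        for (int i = 0; i < 100; i++) keys.add(i);
        // prints: equality examined: 100
        System.out.println("equality examined: " + equalityJoin(keys, keys)[1]);
        // prints: order examined: 10000
        System.out.println("order examined: " + orderJoin(keys, keys)[1]);
    }
}
```

An engine could keep a sorted index to speed up order tests; the scan here is only the simplest case.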

Page 43:

The Slow Comparison

rule initial_boundary_junction_L
if {
    stg: Stage(stg.value == "find_initial_boundary");
    junct: Junction(junct.kind == "2j");
    edge1: Edge(edge1.p1 == junct.base_point,
                edge1.p2 == junct.p1);
    edge2: Edge(edge2.p1 == junct.base_point,
                edge2.p2 == junct.p2);
    !j2: Junction(j2.base_point > junct.base_point);
} do { . . . }

Page 44:

Avoiding the Idiom

rule initial_boundary_junction_L
if {
    stg: Stage(stg.value == "find_initial_boundary");
    junct: from j:Junction
           TakeMax(j.base_point);
    test (junct.kind == "2j");
    edge1: Edge(edge1.p1 == junct.base_point,
                edge1.p2 == junct.p1);
    edge2: Edge(edge2.p1 == junct.base_point,
                edge2.p2 == junct.p2);
} do { . . . }
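
The cost difference between the two formulations can be illustrated by counting comparisons. This is a sketch under assumptions: how OPSJ evaluates TakeMax is not described in the slides, so the single pass below only stands in for it, and `MaxIdiom` and its methods are hypothetical names.

```java
final class MaxIdiom {
    // Negated-condition form: a candidate wins only if no other value is greater.
    // Returns {maximum, comparisons performed}.
    static long[] maxByNegation(int[] basePoints) {
        if (basePoints.length == 0) throw new IllegalArgumentException("no junctions");
        long comparisons = 0;
        int max = basePoints[0];
        for (int candidate : basePoints) {
            boolean blocked = false;
            for (int other : basePoints) {
                comparisons++;
                if (other > candidate) blocked = true; // a match for the negated condition
            }
            if (!blocked) max = candidate;
        }
        return new long[] {max, comparisons};
    }

    // Single pass over the values, in the spirit of TakeMax.
    // Returns {maximum, comparisons performed}.
    static long[] maxBySinglePass(int[] basePoints) {
        if (basePoints.length == 0) throw new IllegalArgumentException("no junctions");
        long comparisons = 0;
        int max = basePoints[0];
        for (int i = 1; i < basePoints.length; i++) {
            comparisons++;
            if (basePoints[i] > max) max = basePoints[i];
        }
        return new long[] {max, comparisons};
    }

    public static void main(String[] args) {
        int[] points = {5, 9, 2, 7};
        long[] slow = maxByNegation(points);
        long[] fast = maxBySinglePass(points);
        // prints: negation: max 9 in 16 comparisons
        System.out.println("negation: max " + slow[0] + " in " + slow[1] + " comparisons");
        // prints: single pass: max 9 in 3 comparisons
        System.out.println("single pass: max " + fast[0] + " in " + fast[1] + " comparisons");
    }
}
```

The pairwise form grows with the square of the number of junctions, the single pass only linearly.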

Page 45:

OTHER CONSIDERATIONS

Page 46:

When To Optimize

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.”

- Donald Knuth

Page 47:

When To Optimize

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

“Yet we should not pass up our opportunities in that critical 3%.”

- Donald Knuth

Page 48:

NP-Complete Problems

There is a class of problems known as “NP-Complete” problems.

There is no known algorithm that can solve NP-Complete problems in polynomial time.

Page 49:

The Bad News

Rule matching is an NP-Complete problem.

Page 50:

But…

Processing SQL queries is also an NP-Complete problem.

Page 51:

CONCLUSIONS

Page 52:

Understand Rule Engine Performance

• Rete is a state-saving algorithm; on each cycle it maps from working memory changes to changes in rule matching information.

• Usually, the constant (“alpha”) tests are not a problem.

• The variable (“beta”) tests can be a problem.

Page 53:

Profiling

• Generally, only a few rules will impact performance significantly.

• It is important to profile the rules, and not guess at the culprits.

Page 54:

Dealing With Expensive Rules

Typical problems to look for:

– Computing more information than is needed.

– Throwing away information that is still useful.

– Using expensive idioms.

There are problems that are inherently expensive. (But we can hope they are rare.)

Page 55:

Thank You