decision camp 2014 - charles forgy - affecting rules performance

Factors Affecting Rule Performance

Charles L. Forgy

October, 2014

Outline

• Rete: A 10,000 foot view.

• Dealing With Expensive Rules.

• Other Considerations.

• Conclusion.

RETE: A 10,000 FOOT VIEW.

Reactive Rules

We are only going to consider reactive rules.

• Inference rules.

• Monitoring rules.

Rete

Most rule engines use some form of Rete to handle reactive rules.

• Advisor, CLIPS, Drools, ILOG, Jess, OPSJ, Smarts, TIBCO ...

OPSJ

rule marking

if {

st: Stage(st.value == "marking");

j1: Junction(j1.visited == "now");

e1: Edge(e1.p2 == j1.base_point);

j2: Junction(j2.base_point == e1.p1,

j2.visited == "yes");

} do {

j2.visited = "check";

update(j2);

}

The Recognize-Act Cycle

1. Match: The engine determines which rules have satisfied conditions.

2. Conflict Resolution: One or more rules are selected for execution.

3. Action: The Then parts of the selected rules are executed.

The Recognize-Act cycle repeats until there are no satisfied rules.
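The cycle can be sketched in a few lines of Python. This is a deliberately naive illustration, not how Rete or OPSJ works internally: it rematches every rule on every cycle, the names `Rule` and `run` are hypothetical, and conflict resolution is assumed to be "take the first satisfied rule".

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Rule:
    name: str
    match: Callable[[list], Iterable[tuple]]   # returns the satisfied instantiations
    action: Callable[[list, tuple], None]      # changes working memory


def run(rules: List[Rule], wm: list, max_cycles: int = 1000) -> int:
    """Repeat match / conflict resolution / action until no rule is satisfied."""
    cycles = 0
    while cycles < max_cycles:
        # 1. Match: collect every satisfied (rule, instantiation) pair.
        conflict_set = [(r, inst) for r in rules for inst in r.match(wm)]
        if not conflict_set:
            break
        # 2. Conflict resolution (assumed strategy): pick the first entry.
        rule, inst = conflict_set[0]
        # 3. Action: execute the selected rule.
        rule.action(wm, inst)
        cycles += 1
    return cycles


# A Python stand-in for an "order" rule: if low < mid < hi, delete mid.
def order_match(wm):
    return [(lo, mid, hi) for lo in wm for mid in wm for hi in wm if lo < mid < hi]


def order_action(wm, inst):
    wm.remove(inst[1])


wm = [0, 1, 2, 3]
fired = run([Rule("order", order_match, order_action)], wm)
print(fired, wm)  # 2 [0, 3]
```

Rematching everything on each cycle is exactly the cost that a state-saving matcher avoids.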

The Problem for the Engine

• The working memory may contain thousands of objects.

• Systems may contain hundreds to thousands of rules.

• A condition part may contain several conditions, each of which has to match an object.

Saving Information

• Typically, only a few objects are added to or removed from Working Memory on each cycle.

• So, most of the information that was computed for any cycle N is still useful on cycle N+1.

The Task

Given the state of the system on cycle N, to determine the state for cycle N+1:

• Delete the specific pieces of information that are no longer correct.

• Add in the new pieces of information.

More Specifically

Given the state of the system on cycle N, to determine the state for cycle N+1:

• For each object in working memory that was deleted or changed, remove every piece of match information referring to that object.

• For each object in working memory that was changed or added, add any match information that can make use of that object.
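A minimal sketch of this incremental update, assuming a single two-condition rule (`mid > low`) and a flat set of saved matches. The function name `apply_changes` and the data layout are illustrative assumptions; a changed object would be passed as both removed and added.

```python
def satisfies(pair):
    """The rule's test: the second object must be greater than the first."""
    low, mid = pair
    return mid > low


def apply_changes(wm, matches, removed=(), added=()):
    """Bring working memory and saved matches from cycle N to cycle N+1."""
    # Remove every piece of match information that refers to a removed object.
    for obj in removed:
        wm.discard(obj)
    dead = {m for m in matches if any(o in removed for o in m)}
    matches -= dead
    # Add the match information that can make use of each added object.
    for obj in added:
        wm.add(obj)
    for obj in added:
        for other in wm:
            for pair in ((obj, other), (other, obj)):
                if satisfies(pair):
                    matches.add(pair)


wm, matches = set(), set()
apply_changes(wm, matches, added=[0, 1])
apply_changes(wm, matches, added=[2])
apply_changes(wm, matches, removed=[1])
print(sorted(matches))  # [(0, 2)]
```

Only pairs that involve a changed object are examined; matches among unchanged objects are carried over untouched.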

Constraints Within Conditions

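rule marking

if {

st: Stage(st.value == "marking");

j1: Junction(j1.visited == "now");

e1: Edge(e1.p2 == j1.base_point);

j2: Junction(j2.base_point == e1.p1,

j2.visited == "yes");

} do {

j2.visited = "check";

update(j2);

}

Each of these tests involves a single object: st.value == "marking", j1.visited == "now", j2.visited == "yes".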

Constraints Between Conditions

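rule marking

if {

st: Stage(st.value == "marking");

j1: Junction(j1.visited == "now");

e1: Edge(e1.p2 == j1.base_point);

j2: Junction(j2.base_point == e1.p1,

j2.visited == "yes");

} do {

j2.visited = "check";

update(j2);

}

These tests compare fields of different objects: e1.p2 == j1.base_point, j2.base_point == e1.p1.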

Handling Inserted Objects

To find the rules that might be affected by a changed working memory object, Rete:

1. Uses the constraints within each condition to find the conditions that the object can match on its own. (“Alpha” tests.)

2. Uses the constraints between conditions to determine whether the changed object takes part in a full match. (“Beta” tests.)

Saving Information

Rete saves all the information that it computes while processing each rule.

• It keeps track of the objects that match each condition based on only Alpha tests.

• For each initial sequence of conditions in a condition part, it keeps track of the lists of objects that match those conditions when Beta tests are also applied.
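The two kinds of saved state can be sketched for a three-condition rule of the form low < mid < hi. This is an illustrative Python model, not OPSJ's data structures: the class name `OrderRuleNetwork` is hypothetical, all three conditions are assumed to share one alpha memory because they have the same alpha test, and objects are plain Python integers with no duplicate values.

```python
class OrderRuleNetwork:
    """Alpha and beta memories for a rule with conditions low < mid < hi."""

    def __init__(self):
        self.alpha = []        # objects that pass the alpha test (any integer)
        self.low_mid = []      # beta memory for the prefix [low, mid]
        self.low_mid_hi = []   # beta memory for the prefix [low, mid, hi]

    def insert(self, x):
        old_pairs = list(self.low_mid)
        self.alpha.append(x)
        # New [low, mid] matches: x may play either role.
        new_pairs = [(low, x) for low in self.alpha if x > low]
        new_pairs += [(x, mid) for mid in self.alpha if mid > x]
        # New [low, mid, hi] matches: x extends an old pair as hi,
        # or a new pair is extended by any stored object.
        new_triples = [(low, mid, x) for (low, mid) in old_pairs if x > mid]
        new_triples += [(low, mid, hi) for (low, mid) in new_pairs
                        for hi in self.alpha if hi > mid]
        self.low_mid += new_pairs
        self.low_mid_hi += new_triples


net = OrderRuleNetwork()
net.insert(0)
net.insert(1)
print(net.low_mid, net.low_mid_hi)  # [(0, 1)] []
net.insert(2)
print(net.low_mid)      # [(0, 1), (0, 2), (1, 2)]
print(net.low_mid_hi)   # [(0, 1, 2)]
```

Each insert touches only the combinations that involve the new object; the stored pairs and triples from earlier cycles are reused as they are.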

Order Rule

rule order

if {

low: Integer;

mid: Integer(

mid.intValue() > low.intValue());

hi: Integer(

hi.intValue() > mid.intValue());

} do {

delete(mid);

}

Execute:

insert(new Integer(0));

insert(new Integer(1));

Alpha matches:

low: Integer(0), Integer(1)

mid: Integer(0), Integer(1)

hi: Integer(0), Integer(1)

Alpha Matches

rule order

if {

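low: Integer;

{Integer(0), Integer(1)}

mid: Integer(mid.intValue() > low.intValue());

{Integer(0), Integer(1)}

hi: Integer(hi.intValue() > mid.intValue());

{Integer(0), Integer(1)}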

} do {

delete(mid);

}


Execute:

insert(new Integer(0));

insert(new Integer(1));

Beta matches:

[low, mid]: [Integer(0), Integer(1)]

[low, mid, hi]: (none)

Beta Matches

rule order

if {

low: Integer;

mid: Integer(mid.intValue() > low.intValue());

{[Integer(0), Integer(1)]}

hi: Integer(hi.intValue() > mid.intValue());

{ }

} do {

delete(mid);

}


Execute:

insert(new Integer(2));

Alpha matches:

low: Integer(0), Integer(1), Integer(2)

mid: Integer(0), Integer(1), Integer(2)

hi: Integer(0), Integer(1), Integer(2)

Beta matches:

[low, mid]: [Integer(0), Integer(1)],

[Integer(0), Integer(2)],

[Integer(1), Integer(2)]

[low, mid, hi]: [Integer(0), Integer(1), Integer(2)]

Beta Matches

rule order

if {

low: Integer;

mid: Integer(mid.intValue() > low.intValue());

{[Integer(0), Integer(1)],

[Integer(0), Integer(2)],

[Integer(1), Integer(2)]}

hi: Integer(hi.intValue() > mid.intValue());

{[Integer(0), Integer(1), Integer(2)]}

} do {

delete(mid);

}

Handling Deleted Objects

Processing deleted objects is fast. The engine records which pieces of saved match data each object takes part in, so when the object is deleted, the engine can remove all of that stored match information directly.
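One way to get this behaviour is to index every saved match by the objects it contains. The sketch below is an assumption about how such bookkeeping could look, not a description of OPSJ; `TokenStore` and its methods are hypothetical names.

```python
from collections import defaultdict


class TokenStore:
    """Saved matches plus an index from each object to the matches it is in."""

    def __init__(self):
        self.tokens = set()
        self.by_object = defaultdict(set)

    def add(self, token):
        self.tokens.add(token)
        for obj in token:
            self.by_object[obj].add(token)

    def delete_object(self, obj):
        # Look up the affected matches directly; no rule is re-evaluated.
        doomed = self.by_object.pop(obj, set())
        for token in doomed:
            self.tokens.discard(token)
            for other in token:
                if other != obj:
                    self.by_object[other].discard(token)
        return doomed


store = TokenStore()
for t in [(0, 1), (0, 2), (1, 2), (0, 1, 2)]:
    store.add(t)
store.delete_object(1)
print(store.tokens)  # {(0, 2)}
```

The work done on deletion is proportional to the number of matches that contain the deleted object, not to the size of working memory.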

Order Rule

rule order

if {

low: Integer;

mid: Integer(

mid.intValue()>low.intValue());

hi: Integer(

hi.intValue()>mid.intValue());

} do {

delete(mid);

}

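After the rule fires and deletes Integer(1):

Alpha matches:

low: Integer(0), Integer(2)

mid: Integer(0), Integer(2)

hi: Integer(0), Integer(2)

Beta matches:

[low, mid]: [Integer(0), Integer(2)]

[low, mid, hi]: (none)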

Important Points

• Rete saves state, processing only the changed objects each cycle.

• For each initial sequence of conditions in a condition part, Rete keeps track of the lists of objects that match those conditions.

DEALING WITH EXPENSIVE RULES

Finding Expensive Rules

The expensive rules are not always the ones the developer suspects; profiling is essential.

Rule engines have very different profiling tools.

– Must check the documentation for the engine you are using.

OPSJ – Java Profiler

The Slowest Rule

rule start_visit_3_junction

if {

stg: Stage(stg.value == "labeling");

junct: Junction(junct.kind == "3j",

junct.visited == "no");

} do {

junct.visited = "now";

stg.value = "visiting_3j";

update(junct, stg);

}

Excessive Beta Matches

In many cases, expensive rules are caused by creating excessive beta matches.

Typical issues:

– Computing more information than is needed.

– Throwing away information and recomputing it.

How Many Junctions Are There?


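rule start_visit_3_junction

if {

stg: Stage(stg.value == "labeling");

junct: Junction(junct.kind == "3j",

junct.visited == "no");

} do {

junct.visited = "now";

stg.value = "visiting_3j";

update(junct, stg);

}

The junct condition matches every unvisited 3j junction, and each one produces a beta match, although the rule needs only one.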

Updating the Stage Causes Beta Matches to be Discarded


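rule start_visit_3_junction

if {

stg: Stage(stg.value == "labeling");

junct: Junction(junct.kind == "3j",

junct.visited == "no");

} do {

junct.visited = "now";

stg.value = "visiting_3j";

update(junct, stg);

}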

Option 1


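rule start_visit_3_junction

if {

stg: Stage(stg.value == "labeling");

junct: Junction(junct.kind == "3j",

junct.visited == "no");

} do {

junct.visited = "now";

update(junct);

insert(new Stage("visiting_3j"));

}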

Option 1: Continued

To accommodate this change, other rules must be modified to use insert/delete of stages as well.

Option 2


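rule start_visit_3_junction

if {

stg: Stage(stg.value == "labeling");

junct: from j:Junction(j.kind == "3j",

j.visited == "no")

TakeAny;

} do {

junct.visited = "now";

stg.value = "visiting_3j";

update(junct, stg);

}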

Effects of Changes

• Option 1 computes a lot of state information, but does not throw it away while it is still useful.

• Option 2 computes only the state that is needed at each step.
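A back-of-the-envelope model of the difference, written in Python. It rests on assumptions that are not in the slides: the stage returns to "labeling" once per junction, each firing fully processes one junction, and TakeAny simply picks one matching object. The function names are hypothetical.

```python
def run_original(n):
    """Stage is updated on every firing, so its beta matches are rebuilt each time."""
    visited = ["no"] * n
    tokens_built = 0
    while True:
        # The stage (re)enters "labeling": join it with every unvisited junction.
        beta = [("labeling", j) for j in range(n) if visited[j] == "no"]
        tokens_built += len(beta)
        if not beta:
            return tokens_built
        _, j = beta[0]        # the rule fires for one junction
        visited[j] = "yes"    # assumed: that junction is now fully processed
        # Updating the stage discards every remaining entry in `beta`.


def run_take_any(n):
    """Only one [stg, junct] match is built each time the stage is 'labeling'."""
    visited = ["no"] * n
    tokens_built = 0
    while True:
        first = next((j for j in range(n) if visited[j] == "no"), None)
        if first is None:
            return tokens_built
        tokens_built += 1
        visited[first] = "yes"


print(run_original(4))   # 10
print(run_take_any(4))   # 4
```

Under these assumptions the original rule builds n(n+1)/2 beta matches for n junctions, while the TakeAny version builds n.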

OPSJ – Java Profiler

Initial Boundary Rules

rule initial_boundary_junction_L

if {

stg: Stage(stg.value ==

"find_initial_boundary");

junct: Junction(junct.kind == "2j");

edge1: Edge (edge1.p1 == junct.base_point,

edge1.p2 == junct.p1);



!j2: Junction (j2.base_point >

junct.base_point);

} do { . . . }

Expensive Idioms

Another common problem is using expensive idioms in rule conditions.

Maximize Idiom


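rule initial_boundary_junction_L

if {

. . .

!j2: Junction(j2.base_point >

junct.base_point);

} do { . . . }

The negated condition selects the junction with the largest base_point: no other junction may have a greater one.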

Why This Is Expensive

• When comparing a collection of beta matches with a collection of alpha matches, all tests are not equally expensive.

• Equality tests can be handled quite efficiently.

• Order tests (<, >, etc.) take more time to evaluate.
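One common reason for the gap, assumed here rather than stated in the slides, is that an equality join can use a hash index on the alpha memory, while an order test has to be checked against each stored object. A Python sketch with hypothetical function names:

```python
from collections import defaultdict


def join_equal(tokens, alpha, token_key, alpha_key):
    """Equality join: index the alpha memory once, then look each token up."""
    index = defaultdict(list)
    for obj in alpha:
        index[alpha_key(obj)].append(obj)
    return [(t, obj) for t in tokens for obj in index.get(token_key(t), [])]


def join_greater(tokens, alpha, token_key, alpha_key):
    """Order join: every token is compared with every alpha object."""
    return [(t, obj) for t in tokens for obj in alpha
            if alpha_key(obj) > token_key(t)]


def identity(x):
    return x


tokens = [1, 2, 3]
alpha = [2, 3, 4]
print(join_equal(tokens, alpha, identity, identity))
# [(2, 2), (3, 3)]
print(join_greater(tokens, alpha, identity, identity))
# [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```

The equality join does one lookup per token; the order join does one comparison per token per stored object, and it also tends to produce more matches.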

The Slow Comparison


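rule initial_boundary_junction_L

if {

. . .

!j2: Junction(j2.base_point >

junct.base_point);

} do { . . . }

The order test j2.base_point > junct.base_point is the slow comparison.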

Avoiding the Idiom


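rule initial_boundary_junction_L

if {

stg: Stage(stg.value ==

"find_initial_boundary");

junct: from j:Junction

TakeMax(j.base_point);

test (junct.kind == "2j");

edge1: Edge (edge1.p1 == junct.base_point,

edge1.p2 == junct.p1);

. . .

} do { . . . }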
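The difference between the two formulations can be illustrated by counting comparisons in Python. This is only a cost sketch with hypothetical function names: it assumes the negated condition is checked against every stored junction and that TakeMax amounts to a single pass over the candidates.

```python
def max_by_negation(points):
    """'No other point is greater': compare every point with every point."""
    comparisons = 0
    winners = []
    for p in points:
        greater = 0
        for q in points:
            comparisons += 1
            if q > p:
                greater += 1
        if greater == 0:
            winners.append(p)
    return winners, comparisons


def take_max(points):
    """Single pass that keeps the largest value seen so far."""
    comparisons = 0
    best = points[0]
    for p in points[1:]:
        comparisons += 1
        if p > best:
            best = p
    return best, comparisons


points = [3, 9, 4, 1, 7]
print(max_by_negation(points))  # ([9], 25)
print(take_max(points))         # (9, 4)
```

Both find the same junction, but the negation idiom grows quadratically with the number of junctions while the single pass grows linearly.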

OTHER CONSIDERATIONS

When To Optimize

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

“Yet we should not pass up our opportunities in that critical 3%.”

- Donald Knuth

NP-Complete Problems

There is a class of problems known as “NP-Complete” problems.

There is no known algorithm that can solve NP-Complete problems in polynomial time.

The Bad News

Rule matching is an NP-Complete problem.

But…

Processing SQL queries is also an NP-Complete problem.

CONCLUSIONS

Understand Rule Engine Performance

• Rete is a state-saving algorithm; on each cycle it maps from working memory changes to changes in rule matching information.

• Usually, the constant (“alpha”) tests are not a problem.

• The variable (“beta”) tests can be a problem.

Profiling

• Generally, only a few rules will impact performance significantly.

• It is important to profile the rules, and not guess at the culprits.

Dealing With Expensive Rules

Typical problems to look for:

– Computing more information than is needed.

– Throwing away information that is still useful.

– Using expensive idioms.

There are problems that are inherently expensive. (But we can hope they are rare.)

Thank You
