Exploratory Mining and Pruning Optimization of Constrained Associations Rules


Upload: angus

Post on 25-Feb-2016



Page 1: Exploratory Mining and Pruning Optimization of Constrained Associations Rules

August 7, 1998 Data Engineering Lab 성 유진 1

Exploratory Mining and Pruning Optimization of Constrained Associations Rules


Abstract

• Problems from the standpoint of supporting human-centered knowledge discovery
  – lack of user exploration and control
  – lack of focus
  – rigid notion of relationship

• Constrained association queries (CAQs)
  – pruning using anti-monotonicity and succinctness


Introduction

• Problem 1 (Lack of User Exploration and Control)
  – the mining process is a black box: the user cannot preempt it and may need to wait for hours
  – remedy: establish clear breakpoints to allow user feedback

• Problem 2 (Lack of Focus)
  – the user cannot say on which associations to focus the mining, for example:
    • associations between sets of items whose types do not overlap
    • associations from item sets whose total price is at least $1,000
  – remedy: provide a rich interface for the user to express focus (CAQ)

• Problem 3 (Rigid Notion of Relationship)
  – significance metrics
  – separate criteria for selecting candidates for the antecedent and the consequent, e.g. an association from items to sets of types: pepsi => snacks


Architecture

• Phase 1
  – the user initially specifies a CAQ
    • it includes a set of constraints C
    • C is applicable to the antecedent and the consequent
  – output:
    • pairs of candidates (Sa, Sc)
    • Sa and Sc have support above the thresholds
  – the user can add, delete, or modify the constraints as many times as desired

• Phase 2
  – the user specifies a significance metric
  – a threshold for the metric
  – whatever further conditions are to be imposed on the antecedent and the consequent
  – classical association mining corresponds to:
    • confidence (as the significance metric)
    • a confidence threshold
    • requiring that (Sa ∪ Sc) be frequent
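The classical choice for Phase 2 can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names (`count`, `confidence`, `phase2`), the absolute support count `min_count`, and the toy transactions are hypothetical and do not come from the slides.

```python
def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def confidence(sa, sc, transactions):
    """Confidence of the rule sa => sc: count(sa | sc) / count(sa)."""
    base = count(sa, transactions)
    return count(sa | sc, transactions) / base if base else 0.0


def phase2(pairs, transactions, min_conf, min_count):
    """Keep candidate pairs whose union is frequent and whose
    confidence reaches the threshold (the classical setting)."""
    rules = []
    for sa, sc in pairs:
        union_is_frequent = count(sa | sc, transactions) >= min_count
        if union_is_frequent and confidence(sa, sc, transactions) >= min_conf:
            rules.append((sa, sc))
    return rules


# Hypothetical toy data: Phase 1 output pairs and a transaction list.
transactions = [
    frozenset({"pepsi", "chips"}),
    frozenset({"pepsi", "chips", "beer"}),
    frozenset({"pepsi"}),
    frozenset({"beer"}),
]
pairs = [
    (frozenset({"pepsi"}), frozenset({"chips"})),
    (frozenset({"beer"}), frozenset({"chips"})),
]
rules = phase2(pairs, transactions, min_conf=0.6, min_count=2)
```

With another significance metric, only `confidence` and its threshold would change; the candidate pairs from Phase 1 stay the same.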


Constrained Association Queries

• CAQ
  – S ⊆ Item: S is a set variable on the Item domain
  – a CAQ has the form {(S1, S2) | C}, where C is a set of constraints on S1, S2
  – frequency constraints: freq(Si)
  – relations: trans(TID, Itemset), iteminfo(Item, Type, Price)
  – S.Price ≤ 100: all items in S have a price less than or equal to $100
  – {snacks, sodas} ⊆ S.Type: the types of the items in S include both snacks and sodas
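As a rough illustration, the two 1-var constraints above can be written as predicates over item sets. The `ITEM_INFO` table stands in for the iteminfo(Item, Type, Price) relation; its rows and the helper names are assumptions made for this sketch.

```python
# Hypothetical rows of iteminfo(Item, Type, Price).
ITEM_INFO = {
    "chips": ("snacks", 3),
    "cola": ("sodas", 2),
    "wine": ("wines", 120),
}


def types_of(s):
    """S.Type: the set of types of the items in s."""
    return {ITEM_INFO[i][0] for i in s}


def prices_of(s):
    """S.Price: the set of prices of the items in s."""
    return {ITEM_INFO[i][1] for i in s}


def price_at_most(limit):
    """Constraint S.Price <= limit (every item costs at most `limit`)."""
    return lambda s: all(p <= limit for p in prices_of(s))


def types_include(required):
    """Constraint required ⊆ S.Type."""
    return lambda s: set(required) <= types_of(s)


cheap = price_at_most(100)
snack_and_soda = types_include({"snacks", "sodas"})
```

Representing each constraint as a function from an item set to a boolean keeps the later pruning steps independent of how a particular constraint is defined.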


• CAQ Examples
  – {(S1, S2) | S1 ⊆ Item & S2 ⊆ Item & count(S1) = 1 & count(S2) = 1 & freq(S1) & freq(S2)}
    • S1.Type ∩ S2.Type = ∅ and max(S1.Price) ≤ avg(S2.Price)
  – {(S1, S2) | agg1(S1.Price) ≤ 100 & agg2(S2.Price) ≥ 1000}
  – {(S1, S2) | S1.Type ⊆ {Snacks} & S2.Type ⊆ {beers} & max(S1.Price) ≤ min(S2.Price)}

• Sound/Complete
  – an algorithm is sound if it only finds frequent sets that satisfy the given constraints
  – an algorithm is complete if all frequent sets satisfying the given constraints are found
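One way to make the sound/complete definitions concrete is to compare an algorithm's output with a brute-force reference on a tiny database. The sketch below assumes an absolute support count `min_count` and a constraint given as a Python predicate; all names and data are hypothetical.

```python
from itertools import combinations


def brute_force_answers(items, transactions, min_count, constraint):
    """Enumerate every non-empty item set; keep the frequent ones
    that satisfy the constraint. Feasible only for tiny inputs."""
    answers = set()
    ordered = sorted(items)
    for k in range(1, len(ordered) + 1):
        for combo in combinations(ordered, k):
            s = frozenset(combo)
            support = sum(1 for t in transactions if s <= t)
            if support >= min_count and constraint(s):
                answers.add(s)
    return answers


def is_sound(found, reference):
    """Sound: everything found is a valid answer."""
    return set(found) <= reference


def is_complete(found, reference):
    """Complete: every valid answer is found."""
    return reference <= set(found)


transactions = [frozenset("ab"), frozenset("abc"), frozenset("ac")]
reference = brute_force_answers(
    {"a", "b", "c"}, transactions, min_count=2,
    constraint=lambda s: "c" not in s,
)
```

An output that misses a valid set is sound but not complete; an output with an extra set that violates the constraint is complete but not sound.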


• Goal
  – push the constraints as deeply as possible inside the computation of frequent sets
  – running the classical algorithm and then testing its output for constraint satisfaction is too inefficient
  – properties that allow pruning while staying sound and complete: anti-monotonicity, succinctness


Anti-Monotone Constraints

• Find constraints that are anti-monotone
  – they prune away a significant number of candidates

• Definition
  – a 1-var constraint C is anti-monotone iff for all sets S, S':
    • S ⊇ S' & S satisfies C ⇒ S' satisfies C

• Identify which constraints are anti-monotone
  – Fig. 3
  – min(S) ≥ v (anti-monotone), min(S) ≤ v (not anti-monotone)
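The definition can be checked by exhaustion on a small universe. The sketch below is only a finite test, not a proof; the function names and the numeric universe are assumptions made for illustration.

```python
from itertools import combinations


def nonempty_subsets(values):
    """All non-empty subsets of `values` as frozensets."""
    ordered = sorted(values)
    return [
        frozenset(combo)
        for k in range(1, len(ordered) + 1)
        for combo in combinations(ordered, k)
    ]


def is_anti_monotone_on(universe, constraint):
    """True if, inside `universe`, every non-empty subset of a
    satisfying set also satisfies the constraint."""
    for s in nonempty_subsets(universe):
        if constraint(s):
            for sub in nonempty_subsets(s):
                if not constraint(sub):
                    return False
    return True


v = 5
universe = {1, 5, 10}
min_at_least_v = is_anti_monotone_on(universe, lambda s: min(s) >= v)
min_at_most_v = is_anti_monotone_on(universe, lambda s: min(s) <= v)
```

The counterexample for min(S) ≤ v is S = {1, 10}: it satisfies the constraint, but its subset {10} does not.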


Succinct Constraints

• pruning is done once and for all, before any iteration takes place
  – not the generate-and-test paradigm
  – how?
    • succinctness
    • member generating functions

• Definition
  – SAT_C(Item): the set of item sets that satisfy C, i.e. the pruned space
    • C1 ≡ S.Price ≥ 100: the pruned space for C1 contains only item sets such that each item in the set has a price of at least $100
  – selection predicate p


Example

C1 ≡ S.Price ≥ 100: let Item1 = σ_{price ≥ 100}(Item).
C1 is succinct because its pruned space SAT_C1(Item) is simply 2^Item1.

C2 ≡ {snacks, sodas} ⊆ S.Type: let Item2, Item3, Item4 be the sets σ_{type = 'snacks'}(Item), σ_{type = 'sodas'}(Item), σ_{type ≠ 'snacks' ∧ type ≠ 'sodas'}(Item).
C2 is succinct because SAT_C2(Item) can be expressed as 2^Item − 2^Item2 − 2^Item3 − 2^Item4 − 2^(Item2 ∪ Item4) − 2^(Item3 ∪ Item4).
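The power-set expression for C2 can be verified on a small item domain by comparing it with a direct filter. The item names and the `ITEM_TYPE` table below are hypothetical; the check illustrates the expression on one example and is not a general proof.

```python
from itertools import combinations


def powerset(items):
    """2^items: all subsets (including the empty set) as frozensets."""
    ordered = sorted(items)
    return {
        frozenset(combo)
        for k in range(len(ordered) + 1)
        for combo in combinations(ordered, k)
    }


# Hypothetical item domain with one type per item.
ITEM_TYPE = {"chips": "snacks", "nuts": "snacks", "cola": "sodas", "wine": "wines"}

item = set(ITEM_TYPE)
item2 = {i for i in item if ITEM_TYPE[i] == "snacks"}
item3 = {i for i in item if ITEM_TYPE[i] == "sodas"}
item4 = item - item2 - item3

# SAT_C2(Item) via the power-set expression.
sat_by_formula = (
    powerset(item)
    - powerset(item2) - powerset(item3) - powerset(item4)
    - powerset(item2 | item4) - powerset(item3 | item4)
)

# SAT_C2(Item) via testing {snacks, sodas} ⊆ S.Type on every subset.
sat_by_filter = {
    s for s in powerset(item)
    if {"snacks", "sodas"} <= {ITEM_TYPE[i] for i in s}
}
```

Both computations yield the item sets that contain at least one snack and at least one soda, which is why no candidate outside this space ever needs to be generated.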


Example

C1 ≡ S.Price ≥ 100: MGF = {X | X ⊆ Item1 & X ≠ ∅}

C2 ≡ {snacks, sodas} ⊆ S.Type: MGF = {X1 ∪ X2 ∪ X3 | X1 ⊆ Item2 & X1 ≠ ∅ & X2 ⊆ Item3 & X2 ≠ ∅ & X3 ⊆ Item4}
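A member generating function enumerates exactly the satisfying sets instead of filtering all subsets. The generator below sketches the MGF for C2 under the assumption that Item2, Item3 and Item4 are disjoint; the function names and sample items are hypothetical.

```python
from itertools import combinations, product


def subsets(items, min_size=0):
    """Subsets of `items` with at least `min_size` elements."""
    ordered = sorted(items)
    for k in range(min_size, len(ordered) + 1):
        for combo in combinations(ordered, k):
            yield frozenset(combo)


def mgf_c2(item2, item3, item4):
    """Generate X1 ∪ X2 ∪ X3 with X1 a non-empty subset of item2,
    X2 a non-empty subset of item3, and X3 any subset of item4."""
    for x1, x2, x3 in product(
        list(subsets(item2, 1)), list(subsets(item3, 1)), list(subsets(item4, 0))
    ):
        yield x1 | x2 | x3


generated = list(mgf_c2({"chips", "nuts"}, {"cola"}, {"wine"}))
```

Every generated set already satisfies C2, so no constraint test is needed afterwards; only the frequency constraint remains to be checked.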


Algorithms

• Algorithm Apriori+
  – computes all frequent sets; among them, those that satisfy the constraints form the answer set

• Algorithm Hybrid(m)
  – when (C − Cfreq) is more selective than Cfreq, Apriori+ is inefficient
  – first checks Cfreq for m iterations
  – then, to reduce the remaining I/O cost, switches to checking (C − Cfreq)
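Apriori+ as described above amounts to a plain level-wise frequent-set computation followed by a filter. The sketch below is a minimal in-memory version written for illustration; it uses an absolute support count and makes no attempt to model I/O cost, so it is not the authors' implementation.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Level-wise computation of all frequent sets with their counts."""
    candidates = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while candidates:
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(current)
        k += 1
        # Join frequent (k-1)-sets; keep a k-set only if all of its
        # (k-1)-subsets are frequent.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
    return frequent


def apriori_plus(transactions, min_count, constraint):
    """Apriori+: compute every frequent set, then test the constraint."""
    return {
        s: n for s, n in apriori(transactions, min_count).items() if constraint(s)
    }


transactions = [frozenset("ab"), frozenset("abc"), frozenset("ac")]
all_frequent = apriori(transactions, min_count=2)
answers = apriori_plus(transactions, 2, constraint=lambda s: "c" not in s)
```

Note that the sets {c} and {a, c} are counted even though the constraint rejects them afterwards; avoiding that wasted counting is the motivation for pushing constraints into the computation.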


CAP algorithm

• 4 cases

Succinct and anti-monotone
  – replace the size-1 candidate set C1 in the Apriori algorithm by C1^c, the candidates that satisfy the constraint

Succinct but not anti-monotone

Anti-monotone but non-succinct
  – define Ck as in the Apriori algorithm, and drop a candidate S if S fails C
  – constraint satisfaction is tested before counting is done

Neither
  – induce any weaker constraint C' from C; depending on whether C' is anti-monotone and/or succinct, use the above strategies
  – once all frequent sets are generated, test them for satisfaction of C
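The anti-monotone but non-succinct case can be sketched by adding one filter to a level-wise loop: candidates that fail C are dropped before any counting. The code below is an illustrative in-memory sketch under assumptions (absolute support count, a hypothetical `PRICE` table, and sum(S.Price) ≤ 100 as the constraint, which is anti-monotone when prices are non-negative); it is not the authors' code.

```python
from itertools import combinations

# Hypothetical prices, all non-negative.
PRICE = {"a": 10, "b": 20, "c": 200}


def cap_anti_monotone(transactions, min_count, constraint):
    """Level-wise mining where an anti-monotone constraint is tested
    on each candidate before its support is counted."""
    candidates = {frozenset([i]) for t in transactions for i in t}
    answers = {}
    counted = 0  # number of candidates whose support was actually counted
    k = 1
    while candidates:
        # Drop candidates that fail C before counting.
        candidates = {c for c in candidates if constraint(c)}
        counted += len(candidates)
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {c: n for c, n in counts.items() if n >= min_count}
        answers.update(current)
        k += 1
        # Both frequency and C are anti-monotone, so a k-set can qualify
        # only if all of its (k-1)-subsets survived the previous level.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
    return answers, counted


transactions = [frozenset("ab"), frozenset("abc"), frozenset("ac")]
answers, counted = cap_anti_monotone(
    transactions, min_count=2,
    constraint=lambda s: sum(PRICE[i] for i in s) <= 100,
)
```

On this toy input only three candidates are counted, because {c} is rejected at the first level and no superset of it is ever generated.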
