bcs talk - what sells well and when?

Post on 12-May-2015

Category:

Technology

DESCRIPTION

A 1 hour talk given to the BCS on 7th February about using computational intelligence in data mining.

TRANSCRIPT

What sells well and when?

Stephen G. Matthews

Centre for Computational Intelligence (CCI), De Montfort University

7th February 2012 / BCS meeting

Stephen G. Matthews (DMU) What sells well and when? BCS meeting 1 / 36

Outline

1 Background

2 The Problem

3 The Solution

4 Experiments

Association Rules

Significant correlations between items in datasets.

Uses: positioning stock on shelves, inventory control and cross-selling.

Descriptive data mining.

Example rule (20% of customers matched the rule):

IF pizza THEN beer

Terminology and formal description

A dataset contains a set of N transactions $T = \{t_1, t_2, \ldots, t_N\}$.

Each transaction comprises a subset of items, referred to as an itemset, from M items $I = \{i_1, i_2, \ldots, i_M\}$.

Support count measures the number of transactions containing an itemset:

$$\sigma(X) = |\{t_i \mid X \subseteq t_i,\ t_i \in T\}|. \tag{1}$$

The support-confidence framework:

Support,

$$s(X \Rightarrow Y) = \frac{\sigma(X \cup Y)}{N}; \tag{2}$$

Confidence,

$$c(X \Rightarrow Y) = \frac{\sigma(X \cup Y)}{\sigma(X)}. \tag{3}$$

Example

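TID  Cheese  Beer  Pizza
1    1       1     1
2    1       0     1
3    0       1     1
4    0       1     0
5    1       0     0

$$\text{Support}(\text{pizza} \Rightarrow \text{beer}) = \frac{\sigma(\text{pizza} \cup \text{beer})}{N} = \frac{2}{5} = 0.4$$

$$\text{Confidence}(\text{pizza} \Rightarrow \text{beer}) = \frac{\sigma(\text{pizza} \cup \text{beer})}{\sigma(\text{pizza})} = \frac{2}{3} = 0.\dot{6}$$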
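The two measures can be checked with a few lines of Python. This is a minimal sketch that encodes the example table as sets of items; the function names are illustrative choices and do not come from the talk.

```python
# Transactions from the example table, as sets of purchased items.
transactions = [
    {"cheese", "beer", "pizza"},  # TID 1
    {"cheese", "pizza"},          # TID 2
    {"beer", "pizza"},            # TID 3
    {"beer"},                     # TID 4
    {"cheese"},                   # TID 5
]


def support_count(itemset, transactions):
    """sigma(X): number of transactions that contain every item in X."""
    return sum(1 for t in transactions if itemset <= t)


def support(antecedent, consequent, transactions):
    """s(X => Y) = sigma(X u Y) / N."""
    return support_count(antecedent | consequent, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """c(X => Y) = sigma(X u Y) / sigma(X)."""
    joint = support_count(antecedent | consequent, transactions)
    return joint / support_count(antecedent, transactions)


s = support({"pizza"}, {"beer"}, transactions)
c = confidence({"pizza"}, {"beer"}, transactions)
print(s)            # 0.4
print(round(c, 3))  # 0.667
```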

Fuzzy Association Rules

Represent quantities of items with words.

Interpretable and comprehensible.

Uncertainty in data (e.g., web server log) and linguistic uncertainty (human interpretation).

Imprecision in data (physical measurements from weighing goods in a butcher's, a fishmonger's and a sweet shop).

Example rule (20% of customers matched the rule):

IF quantity of pizza is high THEN quantity of beer is high

What are Fuzzy Sets?

Fuzzy sets have elements that have degrees of membership in [0, 1].

Crisp boundary problem.

[Figure: membership µ (from 0 to 1) plotted against height (6ft to 7ft). (a) A crisp set tall. (b) A fuzzy set tall.]
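The difference between the two panels can be sketched in code. The shapes below are assumptions made for illustration: the crisp set switches from 0 to 1 at an assumed threshold of 6ft, and the fuzzy set rises linearly from 0 at 6ft to 1 at 7ft. The slide does not state the exact functions.

```python
def crisp_tall(height_ft, threshold_ft=6.0):
    """Crisp set: membership is 0 or 1 (the threshold is an assumed value)."""
    return 1.0 if height_ft >= threshold_ft else 0.0


def fuzzy_tall(height_ft, lower_ft=6.0, upper_ft=7.0):
    """Fuzzy set: membership rises linearly between lower_ft and upper_ft (assumed shape)."""
    if height_ft <= lower_ft:
        return 0.0
    if height_ft >= upper_ft:
        return 1.0
    return (height_ft - lower_ft) / (upper_ft - lower_ft)


# The crisp set jumps at the boundary; the fuzzy set changes gradually.
for height in (5.9, 6.0, 6.5, 7.0):
    print(height, crisp_tall(height), fuzzy_tall(height))
```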

“Fuzzy Logic: An Introduction” - Award-winning video from the CCI

◮ https://www.youtube.com/watch?v=P8wY6mi1vV8
◮ http://www.cci.dmu.ac.uk/news-archive/212/

Example

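TID  Cheese  Beer  Pizza
1    1       0     2
2    0       16    5
3    0       15    6
4    7       8     0
5    2       8     1

[Figure: membership functions low, medium and high plotted as µ against Quantity. Quantity axis ticks at 1, 16 and 20; µ axis ticks at 0, 0.58 and 1.]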

Example

     Cheese            Beer              Pizza
TID  l     m     h     l     m     h     l     m     h
1    1     0     0     0     0     0     0.79  0.21  0
2    0     0     0     0     0.42  0.58  0.42  0.58  0
3    0     0     0     0     0.58  0.42  0.58  0.42  0
4    0.37  0.63  0     0.26  0.74  0     0     0     0
5    0.79  0.21  0     0.26  0.74  0     1     0     0

$$\text{FuzzySupport}(\text{cheese.l} \Rightarrow \text{pizza.l}) = \frac{\sum_{i=1}^{5} \min(\text{cheese.l}, \text{pizza.l})}{N} = \frac{0.79 + 0.79}{5} = 0.316$$
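The same calculation can be written as a short function. This is a sketch only: the membership degrees are copied from the table above, while the data layout and the names `degree` and `fuzzy_support` are illustrative assumptions.

```python
# Membership degrees from the example table: (low, medium, high) per item.
memberships = [
    {"cheese": (1.00, 0.00, 0.00), "beer": (0.00, 0.00, 0.00), "pizza": (0.79, 0.21, 0.00)},
    {"cheese": (0.00, 0.00, 0.00), "beer": (0.00, 0.42, 0.58), "pizza": (0.42, 0.58, 0.00)},
    {"cheese": (0.00, 0.00, 0.00), "beer": (0.00, 0.58, 0.42), "pizza": (0.58, 0.42, 0.00)},
    {"cheese": (0.37, 0.63, 0.00), "beer": (0.26, 0.74, 0.00), "pizza": (0.00, 0.00, 0.00)},
    {"cheese": (0.79, 0.21, 0.00), "beer": (0.26, 0.74, 0.00), "pizza": (1.00, 0.00, 0.00)},
]

LABEL_INDEX = {"l": 0, "m": 1, "h": 2}


def degree(transaction, term):
    """Membership of one transaction in a fuzzy item such as 'pizza.l'."""
    item, label = term.split(".")
    return transaction[item][LABEL_INDEX[label]]


def fuzzy_support(terms, transactions):
    """Sum over transactions of the minimum membership across terms, divided by N."""
    total = sum(min(degree(t, term) for term in terms) for t in transactions)
    return total / len(transactions)


print(round(fuzzy_support(["cheese.l", "pizza.l"], memberships), 3))  # 0.316
```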

Temporal Association Rules

Lifespan: occurs over a period of time, e.g., one week.

Cyclic: recurs at regular intervals.

Calendar: occurs in periods defined with a calendar, e.g., 1st January 1970.

The goal

Example rule (20% of customers matched the following rule on one Friday evening):

IF quantity of pizza is high THEN quantity of beer is high

Traditional Approach

1 Define linguistic labels and membership function parameters (clustering, GA and uniform partitioning).

2 Mine rules using the linguistic labels and membership functions, e.g., cheese.l, cheese.m, cheese.h, beer.l, beer.m, . . .

[Figure: membership functions low, medium and high plotted as µ (0 to 1) against Quantity.]

Assumes that the membership functions stay the same throughout the entire dataset . . .

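Losing Rules

TID  Cheese  Beer  Pizza
1    1       0     2
2    0       16    5
3    0       15    6
4    7       8     0
5    2       8     1

Quantities at intersection of membership function boundaries.

[Figure: membership functions low, medium and high plotted as µ against Quantity. Quantity axis ticks at 1, 16 and 20; µ axis ticks at 0, 0.58 and 1.]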

Losing Rules

Less prominent across the entire dataset, but more prominent in a temporal period.

Rigid definition of membership functions.

[Figure: membership functions low, medium and high plotted as µ (0 to 1) against Quantity.]

Example

     Cheese            Beer              Pizza
TID  l     m     h     l     m     h     l     m     h
1    1     0     0     0     0     0     0.79  0.21  0
2    0     0     0     0     0.42  0.58  0.42  0.58  0
3    0     0     0     0     0.58  0.42  0.58  0.42  0
4    0.37  0.63  0     0.26  0.74  0     0     0     0
5    0.79  0.21  0     0.26  0.74  0     1     0     0

$$\text{FuzzySupport}(\text{cheese.l} \Rightarrow \text{pizza.l}) = \frac{\sum_{i=1}^{5} \min(\text{cheese.l}, \text{pizza.l})}{N} = \frac{0.79 + 0.79}{5} = 0.316$$

$$\text{FuzzySupport}(\text{pizza.m} \Rightarrow \text{beer.h}) = \frac{\sum_{i=1}^{5} \min(\text{pizza.m}, \text{beer.h})}{N} = \frac{0.58 + 0.42}{5} = 0.2$$

2-tuple Linguistic Representation

Displace the membership function left or right.

Overcomes lack of flexibility.

Maintains interpretability whilst discovering more temporal rules.

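[Figure: membership functions s0, s1 and s2 plotted as µ (0 to 1) against Quantity, with s1 displaced laterally by α = −0.3 to give the 2-tuple (s1, −0.3).]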
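A lateral displacement can be sketched as a shift of a triangular membership function. The details below are assumptions made for illustration and are not stated on the slide: three uniformly spaced triangular functions over quantities 1 to 20, and α measured as a fraction of the distance between neighbouring label centres.

```python
def triangular(x, left, peak, right):
    """Triangular membership function with feet at left/right and apex at peak."""
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def two_tuple_membership(x, label_index, alpha, n_labels=3, lo=1.0, hi=20.0):
    """Membership of x in label s_<label_index> displaced by alpha.

    Assumed convention: alpha is scaled by the spacing between label centres,
    so (s1, -0.3) moves s1 left by 0.3 of that spacing.
    """
    step = (hi - lo) / (n_labels - 1)
    peak = lo + (label_index + alpha) * step
    return triangular(x, peak - step, peak, peak + step)


# Quantity 10.5 is the centre of s1 when alpha = 0, but not after the shift.
print(round(two_tuple_membership(10.5, 1, 0.0), 3))   # 1.0
print(round(two_tuple_membership(10.5, 1, -0.3), 3))  # 0.7
```

The label itself (s1) is unchanged, which is why the displaced function stays interpretable: only its position on the Quantity axis moves.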

Search for Rules with a Genetic Algorithm (GA)

What is a GA?

Search method based on principles of genetics and natural selection.

Solution to a problem encoded in a chromosome.

Many solutions compete in a population.

Performance of solutions measured with fitness function.

Population of solutions evolves over time.

Particularly good in large and complex search spaces.
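The points above can be illustrated with a very small, generic GA. This is a sketch under stated assumptions and is not the CHC algorithm used in the experiments: chromosomes are bit lists, the fitness function simply counts ones, and the selection, crossover and mutation settings are arbitrary illustrative choices.

```python
import random


def run_ga(fitness, n_genes=20, pop_size=30, generations=40,
           mutation_rate=0.05, seed=0):
    """Evolve bit-list chromosomes and return the fittest one found."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: the current best solution always survives.
        next_generation = [max(population, key=fitness)[:]]
        while len(next_generation) < pop_size:
            # Tournament selection: solutions compete for the right to reproduce.
            parent_a = max(rng.sample(population, 3), key=fitness)
            parent_b = max(rng.sample(population, 3), key=fitness)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randrange(1, n_genes)
            child = parent_a[:cut] + parent_b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_generation.append(child)
        population = next_generation
    return max(population, key=fitness)


best = run_ga(fitness=sum)  # toy fitness: number of ones in the chromosome
print(best, sum(best))
```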

Why is it used for this problem?

Searches for rules.

Combination of different search spaces.

Simultaneously:

◮ Tunes lateral displacements of membership functions.
◮ Discovers a rule.
◮ Discovers temporal period of a rule.

Chromosome

$$C = (e_l, e_u, i_1, s_1, \alpha_1, a_1, \ldots, i_k, s_k, \alpha_k, a_k)$$

$e_l$: lower endpoint

$e_u$: upper endpoint

$i$: item (e.g., beer)

$s$: linguistic label (e.g., high)

$\alpha$: lateral displacement

$a$: antecedent/consequent flag

$k$: number of items in rule
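One possible way to hold this encoding in code is sketched below. The class and field names are hypothetical, and the flag is assumed to be a boolean; the values in the usage example are those of the example rule shown in this talk (endpoints 9300–9400, Item38 and Item12).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gene:
    item: str            # i: item, e.g. "beer"
    label: str           # s: linguistic label, e.g. "high"
    alpha: float         # lateral displacement of the membership function
    is_antecedent: bool  # a: True for the IF part, False for the THEN part (assumed encoding)


@dataclass
class Chromosome:
    lower: int           # e_l: lower endpoint of the temporal period
    upper: int           # e_u: upper endpoint of the temporal period
    genes: List[Gene]    # one gene per item, k genes in total

    def describe(self) -> str:
        """Render the chromosome as a readable IF/THEN rule."""
        def side(flag: bool) -> str:
            return " AND ".join(
                f"quantity of {g.item} is ({g.label}, {g.alpha})"
                for g in self.genes if g.is_antecedent == flag
            )
        return f"[{self.lower}-{self.upper}] IF {side(True)} THEN {side(False)}"


rule = Chromosome(9300, 9400, [
    Gene("Item38", "medium", -0.422, True),
    Gene("Item12", "medium", 0.315, False),
])
print(rule.describe())
```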

Fitness Evaluation

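$$\text{Fitness}(C) = \text{TemporalFuzzySupport}(C) + \text{Confidence}(C) \tag{4}$$

$$\text{Fitness}(C) = \frac{\sum_{j=e_l}^{e_u} \text{FuzzySupport}\left(C_X^{(j)} \cap C_Y^{(j)}\right)}{e_u - e_l} + \frac{\sum_{j=e_l}^{e_u} \text{FuzzySupport}\left(C_X^{(j)} \cap C_Y^{(j)}\right)}{\sum_{j=e_l}^{e_u} \text{FuzzySupport}\left(C_X^{(j)}\right)}. \tag{5}$$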
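A rough sketch of this fitness on the five-transaction fuzzy example (pizza.m ⇒ beer.h) follows. Two things are assumptions rather than facts from the slides: the window is treated as lower ≤ j < upper, so that upper − lower equals the number of transactions in the window, and confidence is set to 0 when the antecedent never fires.

```python
def fitness(antecedent, consequent, lower, upper):
    """Return (fitness, temporal fuzzy support, confidence) for one rule.

    antecedent / consequent: per-transaction membership degrees of X and Y.
    The window covers indices lower <= j < upper (assumed endpoint convention).
    """
    window = range(lower, upper)
    joint = sum(min(antecedent[j], consequent[j]) for j in window)
    antecedent_total = sum(antecedent[j] for j in window)
    temporal_support = joint / (upper - lower)
    confidence = joint / antecedent_total if antecedent_total > 0 else 0.0
    return temporal_support + confidence, temporal_support, confidence


# Membership degrees for TIDs 1 to 5, copied from the fuzzy example table.
pizza_m = [0.21, 0.58, 0.42, 0.0, 0.0]
beer_h = [0.0, 0.58, 0.42, 0.0, 0.0]

whole = fitness(pizza_m, beer_h, 0, 5)   # entire dataset
window = fitness(pizza_m, beer_h, 1, 3)  # TIDs 2 and 3 only
print([round(v, 3) for v in whole])   # [1.026, 0.2, 0.826]
print([round(v, 3) for v in window])  # [1.5, 0.5, 1.0]
```

Under these assumptions the rule scores higher inside the window than across the whole dataset, which is the effect the temporal endpoints in the chromosome are meant to capture.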

Iterative Rule Learning

GA is run many times.

Best rule from each run of GA is stored.

Previously discovered rules penalised in fitness function.

[Flowchart with nodes: Begin; Run GA; Add to rule set; Max. rules? (Yes / No); End.]
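The loop can be sketched as follows. Everything specific here is a stand-in: `toy_run_ga` replaces a real GA run with an exhaustive pick over three hypothetical rules, the fitness values are invented toy numbers (not results from the talk), and the penalty is a simple subtraction.

```python
def iterative_rule_learning(run_ga, max_rules):
    """Run the search repeatedly, storing the best rule of each run."""
    rule_set = []
    while len(rule_set) < max_rules:
        best_rule = run_ga(rule_set)  # one run, told which rules are already known
        rule_set.append(best_rule)
    return rule_set


# Hypothetical toy fitness values; these are NOT results from the talk.
TOY_FITNESS = {
    "pizza.h => beer.h": 0.9,
    "cheese.l => pizza.l": 0.6,
    "beer.m => cheese.m": 0.3,
}


def toy_run_ga(known_rules, penalty=1.0):
    """Stand-in for a GA run: pick the best rule under a penalised fitness."""
    def penalised(rule):
        score = TOY_FITNESS[rule]
        # Previously discovered rules are penalised so later runs find new ones.
        return score - penalty if rule in known_rules else score
    return max(TOY_FITNESS, key=penalised)


print(iterative_rule_learning(toy_run_ga, max_rules=2))
# ['pizza.h => beer.h', 'cheese.l => pizza.l']
```

With this stand-in, `max_rules` should not exceed the number of candidate rules, otherwise a penalised rule would be stored twice.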

Methodology

Aims:

1 Improve existing rules discovered with traditional approach
2 Discover new rules not discovered with traditional approach

Compare rules produced from GA (CHC) and traditional exhaustive search (FuzzyApriori).

[Diagram with boxes: One dataset; Define membership functions and linguistic labels; Enumerate partitions of dataset; CHC; FuzzyApriori.]

Dataset

IBM Quest synthetic dataset.

Benchmark dataset for association rule mining.

Parameters: 10,000 transactions, 64 items and quantities in the range 1–20.
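For the exhaustive baseline, the dataset has to be split into temporal partitions. A minimal sketch of one way to enumerate them is given below; the contiguous, equal-width layout and the width of 100 transactions are assumptions, suggested by the example rule endpoints 9300–9400 that appear in this talk.

```python
def enumerate_partitions(n_transactions, width):
    """Yield (lower, upper) endpoints of contiguous, equal-width windows."""
    for lower in range(0, n_transactions, width):
        yield lower, min(lower + width, n_transactions)


partitions = list(enumerate_partitions(10_000, 100))
print(len(partitions))  # 100
print(partitions[93])   # (9300, 9400)
```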

General results

Measure                          GA       FuzzyApriori
Number of Rules                  10000    90325
Average temporal fuzzy support   0.025    0.031
Average confidence (%)           99.986   24.187
Mode of dataset partitions       100      100

What rules have improved?

GA (CHC) found rules that were also discovered with exhaustive search (FuzzyApriori).

Temporal Fuzzy Support
                       Decrease (%)  Increase (%)  Total (%)
CHC and FuzzyApriori   10.49         10.78         21.27
Only CHC               4.26          74.47         78.73

What rules are new?

GA (CHC) found rules that were NOT discovered with exhaustive search (FuzzyApriori).

Temporal Fuzzy Support
                       Decrease (%)  Increase (%)  Total (%)
CHC and FuzzyApriori   10.49         10.78         21.27
Only CHC               4.26          74.47         78.73

Why were the new rules lost?

FuzzyApriori discarded rules that fell below minimum thresholds.

Temporal Fuzzy Support
                              Decrease (%)  Increase (%)  Total (%)
Below min. temporal support   3.73          73.98         77.71
Below min. confidence         0.53          0.49          1.02
Total                                                     78.73

What rules are now above the min. thresholds?

Rules that are above minimum thresholds after CHC.

Temporal Fuzzy Support
                              Decrease (%)  Increase (%)  Total (%)
Below min. temporal support   0             24.65         24.65
Below min. confidence         0.23          0.50          0.73
Total                                                     25.38

What does a rule look like?

Endpoints: 9300–9400

Rule: IF quantity of Item38 is (medium, -0.422) THEN quantity of Item12 is (medium, 0.315)

[Figure: two panels plotting µ against Quantity. Item38: the membership function medium displaced to (medium, −0.422), α = −0.422. Item12: the membership function medium displaced to (medium, 0.315), α = 0.315.]

Summary

Temporal rules can be lost by fixing membership functions.

2-tuple provides the flexibility required to discover these rules.

Analysis has unearthed lost rules.

Real-world datasets . . .

Thank you

Stephen G. Matthews
◮ sgm@dmu.ac.uk
◮ www.slideshare.net/stephengmatthews

“Fuzzy Logic: An Introduction”
◮ https://www.youtube.com/watch?v=P8wY6mi1vV8
◮ http://www.cci.dmu.ac.uk/news-archive/212/
