Iterative Optimization of Hierarchical Clusterings

Doug Fisher, Department of Computer Science, Vanderbilt University

Journal of Artificial Intelligence Research 4 (1996) 147-179

Presentation: Yugong Cheng

Page 1:

Iterative Optimization of Hierarchical Clusterings

Doug Fisher, Department of Computer Science, Vanderbilt University

Journal of Artificial Intelligence Research 4 (1996) 147-179

Presentation: Yugong Cheng

04/23/02

Page 2:

Outline

• Introduction
• Objective Function
• Iterative Optimization Methods and Experiments
• Simplification of Hierarchical Clustering
• Conclusion
• Final Exam Questions Summary

Page 3:

Introduction

• Clustering is an unsupervised learning process that groups objects into clusters.

• Major Clustering Methods
– Partitioning
– Hierarchical
– Density-based
– Grid-based
– Model-based

Page 4:

Introduction (Continued)

• Clustering systems differ in
– objective function
– control strategy

• Usually a search strategy cannot be both computationally inexpensive and able to give any guarantee about clustering quality.

Page 5:

Introduction (Continued)

This paper discusses the use of iterative optimization and simplification to construct clusters that satisfy both conditions:

• High quality

• Computationally inexpensive

The suggested method involves two steps:

• Constructing a clustering inexpensively

• Using an iterative optimization method to improve the clustering

Page 6:

Category Utility

• CU(Ck) = P(Ck) Σi Σj [P(Ai = Vij | Ck)² − P(Ai = Vij)²]

• PU({C1, C2, …, CN}) = Σk CU(Ck) / N

where an observation is a vector of values Vij along attributes (or variables) Ai

• This measure rewards clusters Ck that increase the predictability of the Vij within Ck (i.e., P(Ai = Vij | Ck)) relative to their predictability in the population as a whole (i.e., P(Ai = Vij))
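These two formulas can be sketched in Python for nominal data; the function and variable names below are my own, not the paper's:

```python
from collections import Counter

def category_utility(cluster, population):
    """CU(Ck) = P(Ck) * sum_ij [ P(Ai=Vij|Ck)^2 - P(Ai=Vij)^2 ].
    Observations are tuples of nominal values, one slot per attribute Ai."""
    n = len(population)
    p_ck = len(cluster) / n
    score = 0.0
    for i in range(len(population[0])):
        in_cluster = Counter(obs[i] for obs in cluster)
        in_pop = Counter(obs[i] for obs in population)
        for v in in_pop:  # every value Vij of attribute Ai in the population
            p_cond = in_cluster.get(v, 0) / len(cluster)
            p_base = in_pop[v] / n
            score += p_cond ** 2 - p_base ** 2
    return p_ck * score

def partition_utility(clusters):
    """PU = average CU over the N clusters of the partition."""
    population = [obs for c in clusters for obs in c]
    return sum(category_utility(c, population) for c in clusters) / len(clusters)
```

For cleanly separable data, a pure partition scores a strictly higher PU than a mixed one, which is the behavior the search methods in later slides exploit.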

Page 7:

Page 8:

Hierarchical Sorting

• Given an observation and the current partition, evaluate the quality of the clusterings that result from
– placing the observation in each of the existing clusters
– creating a new cluster that covers only the new observation

• Select the option that yields the highest quality score (PU)
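A minimal sketch of this sorting step, under the assumption that observations are tuples of nominal values; pu here is a compact version of the partition-utility measure above, and all names are illustrative:

```python
from collections import Counter

def pu(partition):
    """Partition utility over clusters of nominal observations (tuples)."""
    pop = [o for c in partition for o in c]
    n, total = len(pop), 0.0
    for cluster in partition:
        s = 0.0
        for i in range(len(pop[0])):
            counts = Counter(o[i] for o in cluster)
            base = Counter(o[i] for o in pop)
            for v in base:
                s += (counts.get(v, 0) / len(cluster)) ** 2 - (base[v] / n) ** 2
        total += (len(cluster) / n) * s
    return total / len(partition)

def sort_observation(obs, partition):
    """Return the highest-PU way to place obs: into one of the existing
    clusters, or into a brand-new singleton cluster."""
    candidates = [partition[:k] + [partition[k] + [obs]] + partition[k + 1:]
                  for k in range(len(partition))]
    candidates.append(partition + [[obs]])   # option: new singleton cluster
    return max(candidates, key=pu)
```

Sorting ('a','x') into the partition [[('a','x')], [('b','y')]] joins it to the matching cluster rather than spawning a third one.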

Page 9:

Page 10:

Iterative Optimization Methods

• Reorder-resort (Cluster/2): seed selection, reordering, and re-clustering.

• Iterative redistribution of single observations: moving single observations one by one.

• Iterative hierarchical redistribution: moving a cluster together with its sub-tree.

Page 11:

Reorder-resort (k-means)

k-means: k random seeds are selected, and k clusters are grown around these attractors; the centroids of the clusters are picked as new seeds, and new clusters are grown around them. The process iterates until there is no further improvement in the quality of the generated clustering.
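The loop just described can be sketched as plain k-means over numeric points with Euclidean distance (a generic sketch, not the paper's code; names are mine):

```python
import random

def kmeans(points, k, rng=None):
    """k seeds are selected, clusters grow around them, centroids become the
    new seeds; iterate until the assignment stops changing."""
    rng = rng or random.Random(0)
    seeds = rng.sample(points, k)
    assignment = None
    while True:
        # grow clusters: assign each point to its nearest seed
        new_assignment = [
            min(range(k),
                key=lambda j: sum((p - s) ** 2 for p, s in zip(pt, seeds[j])))
            for pt in points
        ]
        if new_assignment == assignment:   # no further improvement
            return seeds, assignment
        assignment = new_assignment
        # centroids of the grown clusters become the new seeds
        for j in range(k):
            members = [pt for pt, a in zip(points, assignment) if a == j]
            if members:
                seeds[j] = tuple(sum(d) / len(members) for d in zip(*members))
```

On two well-separated groups of points, any pair of distinct seeds converges to the natural two-cluster split.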

Page 12:

Reorder-resort (k-means)

• Ordering data so that consecutive observations are dissimilar (by Euclidean distance) leads to good clusterings

• Extracting a biased “dissimilarity” ordering from the hierarchical clustering

• Initial sorting, extraction of a dissimilarity ordering, re-clustering

Page 13:

Iterative Redistribution of Single Observations

• Moves single observations from cluster to cluster

• A cluster that contains only one observation is removed, and its single observation is resorted

• Iterate until two consecutive iterations yield the same clustering

Page 14:

Single Observation Redistribution Variations

• The ISODATA algorithm determines a target cluster for each observation, but does not move any observation until targets for all observations have been determined

• A sequential version moves each observation as soon as its target is identified through sorting

Page 15:

Iterative Hierarchical Redistribution

• Takes large steps in the search for a better clustering

• Resorts sub-trees instead of single observations

• Sub-tree removal requires that the variable-value counts of its ancestors be decremented; likewise, the host cluster’s variable-value counts must be incremented.

Page 16:

Scheme

• Given an existing hierarchical clustering, a recursive loop examines sibling clusters in the hierarchy in a depth first fashion.

• An inner, iterative loop examines each sibling based on the objective function, and repeats until two consecutive iterations lead to the same set of siblings.

Page 17:

(Continued)

• The recursive loop then turns its attention to the children of each of these remaining siblings.

• Finally, the leaves are reached and resorted.

• The recursive loop is applied repeatedly until no changes occur from one pass to the next.
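One level of this scheme can be sketched as follows. For brevity each sibling sub-tree is flattened to its leaf observations and PU drives the moves; the full algorithm also recurses depth-first into the children and maintains ancestor counts, which this sketch omits (all names are mine):

```python
from collections import Counter

def pu(partition):
    """Partition utility over clusters of nominal observations (tuples)."""
    pop = [o for c in partition for o in c]
    n, total = len(pop), 0.0
    for cluster in partition:
        s = 0.0
        for i in range(len(pop[0])):
            counts = Counter(o[i] for o in cluster)
            base = Counter(o[i] for o in pop)
            for v in base:
                s += (counts.get(v, 0) / len(cluster)) ** 2 - (base[v] / n) ** 2
        total += (len(cluster) / n) * s
    return total / len(partition)

def moves(siblings):
    """Every partition reachable by re-hosting one whole sibling sub-tree
    (flattened here to its observations) under another sibling."""
    for i, sub in enumerate(siblings):
        rest = siblings[:i] + siblings[i + 1:]
        for j in range(len(rest)):
            yield rest[:j] + [rest[j] + sub] + rest[j + 1:]

def redistribute_siblings(siblings):
    """Accept sub-tree moves while they improve PU; stop when none do."""
    while True:
        best = max(moves(siblings), key=pu, default=None)
        if best is None or pu(best) <= pu(siblings):
            return siblings
        siblings = best
```

Moving whole sub-trees lets a stray singleton sibling rejoin its natural cluster in one step, which is the “large step” the slide refers to.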

Page 18:

Page 19:

Experiment conditions

– The initial clustering is generated by hierarchical sorting on

• randomly ordered observations

• similarity-ordered observations, which samples observations within the same region before sampling observations from differing regions

– Optimization strategies are applied

– Assume the primary goal of clustering is to discover a single-level partitioning of the data that is of optimal quality

Page 20:

Comparison between Iterative Optimization Strategies

Page 21:

Main findings from the Table:

• Hierarchical redistribution achieves the highest mean PU scores in both the random and similarity case in 3 of 4 domains.

• Reordering and re-clustering comes closest to hierarchical redistribution’s performance in all cases, bettering it in one domain.

• Single-observation redistribution modestly improves an initial sort, and is substantially worse than the other two optimization methods.

Page 22:

Time requirements

Page 23:

Level of Tree

Page 24:

Simplifying Hierarchical Clustering

• Simplify hierarchical clustering and minimize classification cost

• Minimize Error Rate

• A validation set is used to identify the frontier of clusters for prediction of each variable

• Nodes that lie below the frontier of every variable are pruned

Page 25:

Validation

• For each variable, Ai, the objects from the validation set are each classified through the hierarchical clustering with the value of variable Ai “masked” for purposes of classification.

• At each cluster encountered during classification the observation’s value for Ai is compared to the most probable value for Ai at the cluster.

• A count of all correct predictions for each variable at a cluster is maintained.

• A preferred frontier for each variable is identified that maximizes the number of correct counts for the variable.
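The counting and frontier selection above can be sketched for a single variable. The dict-based tree, the route rule used to classify with the variable masked, and every name here are assumptions for illustration:

```python
def classify_path(node, obs, route):
    """Clusters encountered when obs is classified from the root down."""
    path = [node]
    while node.get('children'):
        node = route(node, obs)
        path.append(node)
    return path

def correct_counts(tree, validation, var, route):
    """Per-cluster count of correct predictions of `var`; `var` is masked
    during classification, so route must not inspect it."""
    counts = {}
    for obs in validation:
        for node in classify_path(tree, obs, route):
            if node['majority'].get(var) == obs[var]:
                counts[id(node)] = counts.get(id(node), 0) + 1
    return counts

def best_frontier(node, counts):
    """Frontier under `node` that maximizes correct counts for the variable:
    keep this node, or replace it by its children's best frontiers."""
    here = counts.get(id(node), 0)
    children = node.get('children') or []
    if children:
        results = [best_frontier(c, counts) for c in children]
        below = sum(n for _, n in results)
        if below > here:
            return [f for frontier, _ in results for f in frontier], below
    return [node], here
```

A node that then falls below the preferred frontier of every variable would be pruned.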

Page 26:

Page 27:

Page 28:

Concluding Remarks

• There are three phases in searching the space of hierarchical clusterings:
– Inexpensive generation of an initial clustering
– Iterative optimization of clusterings
– Retrospective simplification of generated clusterings

• The new method, hierarchical redistribution optimization, works well.

Page 29:

Final Exam Questions

1. The main idea of the paper is to construct clusterings that satisfy two conditions. 1) Name the conditions; 2) name the two steps taken to satisfy them.

1) The clusterings should satisfy both conditions: high quality and computationally inexpensive construction.

2) First construct a clustering inexpensively (hierarchical sorting), then use an iterative optimization method to improve the quality of the clustering (reorder-resort, iterative single-observation redistribution, or hierarchical redistribution).

Page 30:

Final Exam Question

2. Describe the three iterative methods for clustering optimization:

Reorder-resort (k-means): Extracting a biased “dissimilarity” ordering from the initial hierarchical clustering, then performing k-means partitioning iteratively.

Iterative redistribution of single observations: moving single observations one by one. A cluster that contains only one observation is removed and its single observation is resorted. Iterate until two consecutive iterations yield the same clustering.

Hierarchical redistribution: takes large steps in the search for a better clustering by resorting sub-trees instead of single observations.

• Given an existing hierarchical clustering, a recursive loop examines sibling clusters in the hierarchy in a depth-first fashion.

• An inner, iterative loop examines each sibling based on the objective function, and repeats until two consecutive iterations lead to the same set of siblings.

• The recursive loop then turns its attention to the children of each of these remaining siblings.

• Finally, the leaves are reached and resorted.

• The recursive loop is applied repeatedly until no changes occur from one pass to the next.

Page 31:

Final Exam Question

3. (1) The cluster is better when the relative CU score is a) big, b) small, c) equal to 0.

The cluster is better with a higher CU score. So choose a).

(2) Which sorting method is better? a) random sorting, b) similarity sorting.

A dissimilarity ordering yields better clusterings, so random ordering of the samples is better than similarity ordering. Choose a).