
Page 1:

Lecture 09 Clustering-based Learning

• Topics
  – Basics
  – K-Means
  – Self-Organizing Maps
  – Applications
  – Discussions

Page 2:

Basics

• Clustering
  – Grouping a collection of objects (examples) into clusters, such that objects are most similar within each cluster and least similar between clusters.
  – Core problem: similarity definition
    • Intra-cluster similarity
    • Inter-cluster similarity
  – Inductive learning
  – Unsupervised learning

Page 3:

Basics

• Minimizing intra-cluster dissimilarity is equivalent to maximizing inter-cluster dissimilarity.

• Clustering performance in terms of intra-cluster dissimilarity, with K clusters and d(xi, xi') as the dissimilarity measure:

$$DS(C) = \frac{1}{2}\sum_{k=1}^{K} W_k, \qquad \text{where}\quad W_k = \sum_{C(i)=k}\,\sum_{C(i')=k} d(x_i, x_{i'})$$
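A minimal NumPy sketch of this measure, assuming squared Euclidean distance as d (the function name is ours):

```python
import numpy as np

def within_cluster_scatter(X, labels, K):
    """DS(C) = (1/2) * sum_k W_k, with W_k the sum of pairwise
    squared Euclidean distances between points in cluster k."""
    total = 0.0
    for k in range(K):
        Xk = X[labels == k]                        # points assigned to cluster k
        diffs = Xk[:, None, :] - Xk[None, :, :]    # all pairwise differences
        total += 0.5 * np.sum(diffs ** 2)          # the 1/2 from the formula
    return total

# Two tight, well-separated blobs give a small DS(C)
X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9]])
print(within_cluster_scatter(X, np.array([0, 0, 1, 1]), K=2))
```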

Page 4:

Basics

• The dissimilarity measure depends on value types and value coding systems.

• Some examples:
  – Quantitative variables: $d(x_i, x_{i'}) = l(x_i - x_{i'})$, where $l$ is a loss function such as the squared difference.
  – Ordinal variables: replace the $i$-th of $M$ ordered values by $(i - 1/2)/M$, then treat it as quantitative.
  – Categorical variables:
$$d(x_i, x_{i'}) = \begin{cases} 0, & x_i = x_{i'} \\ 1, & \text{otherwise} \end{cases}$$
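A small sketch of these three cases (the function name and the squared-difference choice of l are ours):

```python
def dissimilarity(a, b, kind="quantitative", M=None):
    """One-variable dissimilarity following the slide's three cases."""
    if kind == "quantitative":
        return (a - b) ** 2                   # squared difference as l
    if kind == "ordinal":                     # a, b are ranks in 1..M
        return ((a - 0.5) / M - (b - 0.5) / M) ** 2
    if kind == "categorical":
        return 0.0 if a == b else 1.0
    raise ValueError(f"unknown kind: {kind}")

print(dissimilarity(3, 1, "ordinal", M=5))    # 0.16
```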

Page 5:

Basics

• Clustering algorithms
  – Combinatorial algorithms
    • Work directly on the observed data
    • K-Means
    • Self-Organizing Maps

Page 6:

K-Means

• A statistical learning mechanism
• A given object is assigned to the cluster whose mean value it is least dissimilar to.
• Euclidean or Manhattan distance is commonly used to measure dissimilarity.
• The mean value of each cluster is recalculated in each iteration.

Page 7:

K-Means

• Step 1: Selecting Centers
  Select k objects randomly; each becomes the center (mean) of an initial cluster.

• Step 2: Clustering
  Assign each of the remaining objects to the cluster with the nearest center. The most popular method for calculating distance is Euclidean distance. Given two points p = (p1, p2, …, pk) and q = (q1, q2, …, qk), their Euclidean distance is defined as:

$$d(p, q) = \left[ \sum_{i=1}^{k} (p_i - q_i)^2 \right]^{1/2}$$

Page 8:

K-Means

• Step 3: Computing New Centers
  Compute new cluster centers. Let xi be one of the elements assigned to the kth cluster, and Nk the number of elements in that cluster. The new center of cluster k, Ck, is calculated as:

$$C_k = \frac{1}{N_k} \sum_{i=1}^{N_k} x_i$$

• Step 4: Iteration
  Repeat Steps 2 and 3 until no members change their clusters.
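Putting Steps 1–4 together, a compact NumPy sketch (a minimal illustration; function and variable names are ours):

```python
import numpy as np

def k_means(X, k, max_iter=100, seed=0):
    """Minimal K-Means following Steps 1-4 of the slides."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: pick k objects at random as initial cluster centers
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each object to the nearest center (Euclidean)
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        # Step 4: stop when no object changes cluster
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step 3: recompute each center as the mean of its members
        for j in range(k):
            members = X[labels == j]
            if len(members):                  # guard against empty clusters
                centers[j] = members.mean(axis=0)
    return centers, labels
```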

Page 9:

K-Means

• Example

[Figure: K-Means example with K = 2 on a 2-D scatter plot (axes 0–10), shown as a sequence of panels: arbitrarily choose K objects as initial cluster centers; assign each object to the most similar center; update the cluster means; reassign objects; update the cluster means again and reassign until membership is stable.]

Page 10:

K-Means

• Usually, the problem itself fixes the setting of K.
• If K is not given, then to find the best K we examine the intra-cluster dissimilarity Wk, which is a function of K.
• Wk usually decreases as K increases.

Page 11:

K-Means

• Deciding K: choose the K at which a sharp drop of Wk is observed (the "elbow" of the Wk-versus-K curve).
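A sketch of this selection procedure, assuming scikit-learn is available (its KMeans.inertia_ is the within-cluster sum of squares, a common stand-in for Wk):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# three synthetic blobs, so the "right" K is 3
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(50, 2))
               for c in ([0, 0], [4, 4], [0, 4])])

for k in range(1, 8):
    w_k = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
    print(k, round(w_k, 1))
# Wk drops sharply up to K = 3, then flattens -> pick K = 3
```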

Page 12:

K-Means

• Hierarchical Clustering

Page 13:

K-Means

• Agglomerative Hierarchical Clustering

[Figure: three 2-D scatter plots (axes 0–10) showing clusters being merged step by step, bottom-up.]

Page 14:

K-Means

• Divisive Hierarchical Clustering

[Figure: three 2-D scatter plots (axes 0–10) showing a single cluster being split step by step, top-down.]

Page 15:

Self-Organizing Maps

• Brain self-organizing structure
  – Our brain is dominated by the cerebral cortex, a very complex structure of billions of neurons and hundreds of billions of synapses.
  – The cortex includes areas that are responsible for different human activities (motor, visual, auditory, etc.) and associated with different sensory inputs.
  – We can say that each sensory input is mapped into a corresponding area of the cerebral cortex.
  – The cortex is a self-organising computational map in the human brain.

Page 16:

Self-Organizing Maps

• The self-organising map (SOM) provides a topological mapping emulating the cortex structure. It places a fixed number of input patterns from the input layer into a higher-dimensional output or Kohonen layer.

• SOM is a subsymbolic learning algorithm; input data need to be numerically coded.

Page 17:

Self-Organizing Maps

[Figure: two network diagrams, (a) and (b), each showing an input layer fully connected to a Kohonen layer, drawn with the binary input patterns 1 0 and 0 1.]

Page 18:

Self-Organizing Maps

• Training of SOM is based on competitive learning: neurons compete among themselves to be activated, but only a single output neuron can be active at any time.

• The output neuron that wins the "competition" is called the winner-takes-all neuron.

• Training in SOM begins with the winner's neighborhood at a fairly large size. Then, as training proceeds, the neighborhood size gradually decreases.

Page 19:

Self-Organizing Maps

• Conceptual architecture

[Figure: input signals x1, x2 enter the input layer, which is fully connected to the output (Kohonen) layer, producing output signals y1, y2, y3.]

Page 20:

Self-Organizing Maps

• The lateral connections are used to create competition between neurons. The neuron with the largest activation level among all neurons in the output layer becomes the winner; it is the only neuron that produces an output signal, while the activity of all other neurons is suppressed in the competition.

• The lateral feedback connections produce excitatory or inhibitory effects, depending on the distance from the winning neuron. This can be achieved with a Mexican hat function describing the synaptic weights between neurons in the Kohonen layer.
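The Mexican hat shape can be modeled, for instance, as a difference of Gaussians; a small sketch with parameter choices of our own:

```python
import numpy as np

def mexican_hat(distance, sigma_e=1.0, sigma_i=3.0, k_e=2.0, k_i=1.0):
    """Difference-of-Gaussians lateral weight: excitatory near the
    winner (small distance), inhibitory farther away, fading to zero."""
    d2 = np.asarray(distance, dtype=float) ** 2
    return (k_e * np.exp(-d2 / (2 * sigma_e ** 2))
            - k_i * np.exp(-d2 / (2 * sigma_i ** 2)))

d = np.arange(0, 10)
print(np.round(mexican_hat(d), 2))   # positive near 0, negative at mid range
```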

Page 21:

Self-Organizing Maps

• Mexican hat function of lateral connection

[Figure: connection strength versus distance from the winner — an excitatory effect around distance 0, flanked by inhibitory effects at larger distances.]

Page 22:

SOM – Competitive Learning Algorithm

• Step 1: Initialization
  Set initial weights to small random values, say in an interval [0, 1], and assign a small positive value, e.g., 0.2 to 0.5, to the initial learning rate parameter α0.

Page 23:

SOM – Competitive Learning Algorithm

• Step 2: Activation and Similarity Matching
  Activate the SOM by applying the input vector X, and find the best-matching (winner) neuron JX at iteration p, using the minimum Euclidean distance criterion:

$$J_X(p) = \arg\min_j \|\mathbf{X} - \mathbf{W}_j(p)\| = \arg\min_j \left[ \sum_{i=1}^{n} \big(x_i - w_{ij}(p)\big)^2 \right]^{1/2}, \qquad j = 1, 2, \ldots, m$$

  where n is the number of neurons in the input layer and m is the number of neurons in the Kohonen layer.

Page 24:

SOM – Competitive Learning Algorithm

• Step 3: Learning
  (a) Calculate the weight corrections according to the competitive learning rule:

$$\Delta w_{ij}(p) = \begin{cases} \alpha(p)\,\big[x_i - w_{ij}(p)\big], & j \in \Lambda_{J_X}(p) \\ 0, & j \notin \Lambda_{J_X}(p) \end{cases}$$

  where

$$\alpha(p) = \alpha_0\,\big(1 - p/T\big), \qquad \Lambda_{J_X}(p) = \big\{\, j \;\big|\; \|j - J_X(p)\| \le d(p) \,\big\}, \qquad d(p) = d_0\,\big(1 - p/T\big)$$

  ΛJ: neighborhood of the winner neuron J; d0: initial neighborhood size; T: total number of repetitions.

Page 25:

SOM – Competitive Learning Algorithm

• Step 3: Learning (Continued)
  (b) Update the weights:

$$w_{ij}(p + 1) = w_{ij}(p) + \Delta w_{ij}(p)$$

  where Δwij(p) is the weight correction at iteration p.

• Step 4: Iteration
  Increase iteration p by one, go back to Step 2, and continue until the minimum-distance Euclidean criterion is satisfied or no noticeable changes occur in the feature map.
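Steps 1–4 in a minimal NumPy sketch for a 1-D line of output neurons (function names and the fixed iteration budget standing in for the stopping criterion are ours; the decay schedules follow the formulas above):

```python
import numpy as np

def train_som(X, m, T, alpha0=0.3, d0=None, seed=0):
    """Competitive learning for a SOM with m output neurons on a line.
    X: (N, n) input vectors; T: total number of repetitions."""
    rng = np.random.default_rng(seed)
    W = rng.random((m, X.shape[1]))          # Step 1: random weights in [0, 1]
    d0 = d0 if d0 is not None else m / 2     # initial neighborhood size
    positions = np.arange(m)                 # neuron coordinates on the lattice
    for p in range(T):
        x = X[rng.integers(len(X))]          # present one input vector
        # Step 2: winner = neuron at minimum Euclidean distance from x
        J = np.argmin(np.linalg.norm(x - W, axis=1))
        alpha = alpha0 * (1 - p / T)         # alpha(p) = alpha0 (1 - p/T)
        d = d0 * (1 - p / T)                 # d(p) = d0 (1 - p/T)
        # Step 3: move the winner's neighborhood toward x
        near = np.abs(positions - J) <= d
        W[near] += alpha * (x - W[near])
        # Step 4: loop for T repetitions (fixed budget in place of the
        # "no noticeable changes in the feature map" test)
    return W
```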

Page 26:

Self-Organizing Maps

• SOM is online K-Means: rather than recomputing cluster means over all members in each pass, each newly presented object nudges only the winning (nearest) center toward itself.

[Figure: 2-D scatter plot (axes 0–10) with a new object being presented to the map.]

Page 27:

Self-Organizing Maps

• Example: a SOM with 100 neurons arranged in the form of a two-dimensional lattice with 10 rows and 10 columns. It is required to classify two-dimensional input vectors: each neuron in the network should respond only to the input vectors occurring in its region.

• The network is trained with 1,000 two-dimensional input vectors generated randomly in a square region in the interval between –1 and +1. The learning rate parameter is fixed at 0.1.
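A sketch of this setup, a 2-D-lattice variant of the train_som sketch above (the fixed learning rate 0.1 is from the slide; the shrinking neighborhood schedule is our assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
rows = cols = 10
X = rng.uniform(-1, 1, size=(1000, 2))       # 1,000 random 2-D inputs in [-1, 1]

W = rng.uniform(-1, 1, size=(rows * cols, 2))
grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)

T = 10_000
alpha = 0.1                                  # fixed learning rate, as in the slide
for p in range(T):
    x = X[p % len(X)]
    J = np.argmin(np.linalg.norm(x - W, axis=1))          # winner neuron
    d = max(1.0, (rows / 2) * (1 - p / T))                # shrinking radius (assumed)
    near = np.linalg.norm(grid - grid[J], axis=1) <= d    # lattice neighborhood
    W[near] += alpha * (x - W[near])
# W now approximates a regular grid over the square, as in the figures below
```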

Page 28:

Self-Organizing Maps

[Figure: network weights plotted as W(2,j) versus W(1,j) on [–1, 1] × [–1, 1] — initial random weights.]

Page 29:

Self-Organizing Maps

[Figure: W(2,j) versus W(1,j) on [–1, 1] × [–1, 1] — after 100 repetitions.]

Page 30:

Self-Organizing Maps

[Figure: W(2,j) versus W(1,j) on [–1, 1] × [–1, 1] — after 1,000 repetitions.]

Page 31:

Self-Organizing Maps

[Figure: W(2,j) versus W(1,j) on [–1, 1] × [–1, 1] — after 10,000 repetitions.]

Page 32:

Applications

• K-Means
  – Clustering ECG signals according to correlation dimensions

• Self-Organizing Maps
  – Finding churner groups
  – Speech recognition

Page 33:

Discussions

• Clustering events with attribute-based representation
  – Attribute-based similarity measure for clusters
  – Hierarchical clustering of event sequences
  – Generalization, e.g.,
    • "A ∧ B ∧ C" generalized to "A ∧ B"
    • "A ∨ B" generalized to "A ∨ B ∨ C"
    • Ontology-based generalization
  – Specialization