TRANSCRIPT
Self Organization: Competitive Learning
CS/CMPE 537 – Neural Networks
Introduction
Competition for limited resources is a way of diversifying and optimizing a distributed system, e.g., a sports tournament or the theory of evolution
Local competition leads to optimization at local and global levels without the need for global control to assign system resources
Neurons compete among themselves to decide the one (winner-takes-all) or ones that are activated
Through competitive learning, the output neurons form a topological map onto the input neurons
This gives self-organizing feature maps, where the network learns the input and output maps in a geometrical sense
Competition
Competition is intrinsically a nonlinear operation
Two types of competition:
Hard competition means that only one neuron wins the resources
Soft competition means that there is a clear winner, but its neighbors also share a small percentage of the system resources
Winner-Takes-All Network
Inputs xi (i = 1, …, p); the outputs are
yi = 1 if neuron i wins the competition
yi = 0 otherwise
The winning criterion can be any rule, as long as one clear winner is found (e.g. largest output, smallest output)
The winner-takes-all network has one layer with p neurons, self-excitatory connections, and lateral inhibitory connections to all other neurons
When an input is applied, the network evolves until the output of the winning neuron tends to 1 while the outputs of the other neurons tend to 0
The winner-takes-all network thus acts as a selector, as sketched below
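As a rough illustration in Python (assuming NumPy), the snippet below computes the selector's steady state directly with argmax rather than simulating the recurrent self-excitatory/lateral-inhibitory dynamics; the function name and the "largest output wins" criterion are illustrative choices:

```python
import numpy as np

def winner_takes_all(outputs):
    """One-hot selector: 1 for the winning neuron, 0 for all others.

    The winner is taken as the neuron with the largest output; any
    rule yielding a single clear winner would serve equally well.
    """
    y = np.zeros_like(outputs, dtype=float)
    y[np.argmax(outputs)] = 1.0
    return y

# Neuron 3 (index 2) has the largest output, so it alone is selected
print(winner_takes_all(np.array([0.2, 0.5, 0.9, 0.1])))  # [0. 0. 1. 0.]
```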
Instar Network (1)
The instar network is made up of McCulloch-Pitts neurons
An instar neuron is able to recognize a single vector pattern based on a modified Hebbian rule that automatically accomplishes weight normalization for normalized inputs
Instar Network (2)
Learning rule
wji(n + 1) = wji(n) + ηyj(n)[xi(n) – wji(n)]
yj is the hard-limiter output of either 0 or 1
The rule modifies the weights for neuron j only if neuron j is active (wins); otherwise, the weights remain unchanged
This rule is similar to Oja’s rule, except that the weights are not multiplied by the output
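A minimal sketch of one instar update step (NumPy; η and the particular vectors are illustrative values):

```python
import numpy as np

def instar_update(w, x, y, eta=0.1):
    """Instar rule: w(n+1) = w(n) + eta * y * (x - w).

    y is the hard-limiter output (0 or 1), so the weight vector moves
    toward the input only when the neuron is active, i.e. wins.
    """
    return w + eta * y * (x - w)

w = np.array([0.0, 1.0])
x = np.array([0.7, 0.7])
print(instar_update(w, x, y=1))  # [0.07 0.97] -- moved toward x
print(instar_update(w, x, y=0))  # [0. 1.]     -- unchanged
```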
Instar Network (3)
The effect of the instar rule is to move the weight vector toward the input along a straight line, by a step proportional to η
A trained instar neuron provides a response of 1 for data samples located near the training pattern (the bias controls the size of the neighborhood)
Competitive Rule for Winner-Takes-All Network
The hard-limiter in the instar network acts as a gate that controls the update of weights
If a winner-takes-all network is used, the gating mechanism is not needed, as the winning neuron is decided separately
Update rule
wj*i(n + 1) = wj*i(n) + η[xi(n) – wj*i(n)]
j* = winning neuron
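A sketch of a full training loop built on this rule (NumPy; initialization from random samples, η, and the epoch count are illustrative choices, and the winner is picked by Euclidean distance, one of the criteria discussed in the next section):

```python
import numpy as np

def competitive_train(X, n_neurons, eta=0.1, epochs=50, seed=0):
    """Single-layer hard-competitive learning.

    For each input, only the winning neuron j* is updated:
        w_j*(n+1) = w_j*(n) + eta * (x(n) - w_j*(n))
    """
    rng = np.random.default_rng(seed)
    # Initialize weights from random training samples so every neuron differs
    W = X[rng.choice(len(X), size=n_neurons, replace=False)].astype(float)
    for _ in range(epochs):
        for x in X[rng.permutation(len(X))]:
            j_star = np.argmin(np.linalg.norm(W - x, axis=1))  # winner
            W[j_star] += eta * (x - W[j_star])                 # update winner only
    return W
```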
Criterion for Competition (1)
What criterion should be used to determine the winning neuron?
The criterion should have some geometrical relevance
Inner product
A neuron wins if the input is closest to the neuron (i.e., to its weight vector)
However, this interpretation only holds when the inputs are normalized
Winning neuron is the one with the largest inner product
Criterion for Competition (2)
Euclidean distance
||x – w|| = [Σi (xi – wi)²]^1/2
Winning neuron is the one with the smallest Euclidean distance
Less efficient to compute than the inner product
L1 norm or Manhattan distance
|x – w| = Σi |xi – wi|
Winning neuron is the one with the smallest L1 distance
The three criteria are compared in the sketch below
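The three criteria side by side (NumPy; the input and weight rows are made-up values for illustration):

```python
import numpy as np

x = np.array([0.6, 0.8])                             # normalized input, ||x|| = 1
W = np.array([[1.0, 0.0],                            # one weight row per neuron
              [0.0, 1.0],
              [0.7, 0.7]])

winner_inner  = np.argmax(W @ x)                         # largest inner product
winner_euclid = np.argmin(np.linalg.norm(W - x, axis=1)) # smallest Euclidean distance
winner_l1     = np.argmin(np.abs(W - x).sum(axis=1))     # smallest L1 distance

print(winner_inner, winner_euclid, winner_l1)  # 2 2 2 -- all agree here
```

The criteria need not always agree; in particular, when inputs and weights are not normalized, the inner product can favor a neuron whose weight vector is merely large.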
Clustering
The competitive rule allows a single-layer linear network to group and represent data samples that lie in a neighborhood of the input space
Each neighborhood is represented by a single output neuron
The weights of the neuron represent the center of the cluster
Thus, a single-layer linear competitive network performs clustering
Clustering is a ‘divide and conquer’ process where an input space is divided into groups of proximal patterns
Vector quantization is an example of clustering
k-Means Clustering Algorithm (1)
Classical algorithm for data clustering
Cluster data vectors xi (i = 1, …, N) into k clusters Ci (i = 1, …, k) such that the distance between the vectors and their respective cluster centers ti (i = 1, …, k) is minimized
Algorithm
Randomly assign the vectors to clusters Ci
Find center of each cluster Ci
ti = (1/Ni) ΣnЄCi xn, where Ni is the number of vectors in Ci
Reassign the vectors to their nearest cluster, i.e., assign vector xn to cluster Cj if ||xn – tj||² is minimum over all j = 1, …, k
Repeat the last two steps until no changes occur; a minimal implementation is sketched below
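A minimal NumPy implementation of these steps (the function name, seed, and iteration cap are illustrative; it assumes no cluster ever becomes empty):

```python
import numpy as np

def kmeans(X, k, seed=0, max_iter=100):
    """Classical k-means: alternate center computation and reassignment."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, size=len(X))          # random initial assignment
    for _ in range(max_iter):
        # Center of each cluster: t_i = (1/N_i) * sum of its members
        centers = np.array([X[labels == i].mean(axis=0) for i in range(k)])
        # Reassign each vector to its nearest center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_labels = np.argmin(dists, axis=1)
        if np.array_equal(new_labels, labels):        # no change -> converged
            break
        labels = new_labels
    return centers, labels
```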
k-Means Clustering Algorithm (2)
Performance measure or function
J = Σi=1..k ΣnЄCi ||xn – ti||²
This measure is minimized by the k-means algorithm
Estimating J by its instantaneous value J(n) = ½||x(n) – ti(n)||² (for the center ti nearest to x(n)), then taking the derivative with respect to ti:
∂J(n)/∂ti = –[x(n) – ti(n)]
and, to minimize J by gradient descent, the change becomes
Δti(n) = –η ∂J(n)/∂ti = η[x(n) – ti(n)]
This is identical to the competitive update rule when ti is replaced by wi
Thus, the single-layer linear competitive network performs k-means clustering of the input data
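The same derivation in compact form (LaTeX; the ½ factor is the usual convention that makes the gradient come out clean and only rescales η):

```latex
J = \sum_{i=1}^{k} \sum_{n \in C_i} \lVert x_n - t_i \rVert^2,
\qquad
J(n) = \tfrac{1}{2} \lVert x(n) - t_i(n) \rVert^2
\;\Longrightarrow\;
\frac{\partial J(n)}{\partial t_i} = -\bigl[x(n) - t_i(n)\bigr],
\qquad
\Delta t_i(n) = -\eta \, \frac{\partial J(n)}{\partial t_i}
             = \eta \bigl[x(n) - t_i(n)\bigr]
```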
k-Means Clustering - Example
Cluster [0, 1, 3, 6, 8, 9, 10, 11] into 2 groups. N = 8, k = 2
C1 = [0, 1, 3, 6]; C2 = [8, 9, 10, 11]
t1 = 10/4 = 2.5; t2 = 38/4 = 9.5
C1 = [0, 1, 3]; C2 = [6, 8, 9, 10, 11] (the point 6 is equidistant from both centers; the tie is broken in favor of C2)
t1 = 4/3 = 1.33; t2 = 44/5 = 8.8
C1 = [0, 1, 3]; C2 = [6, 8, 9, 10, 11]
No change. Thus, these are the final clusters and centers
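The example can be reproduced with a few lines of NumPy (the tie-break for the point 6, which is equidistant from both initial centers, is made explicit to match the slide):

```python
import numpy as np

X = np.array([0., 1., 3., 6., 8., 9., 10., 11.])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])       # initial split, as above

while True:
    t = np.array([X[labels == i].mean() for i in (0, 1)])   # cluster centers
    # Reassign; ties (like the point 6) go to C2, matching the slide
    new = (np.abs(X - t[1]) <= np.abs(X - t[0])).astype(int)
    if np.array_equal(new, labels):
        break
    labels = new

print(t)                                 # [1.3333... 8.8]
print(X[labels == 0], X[labels == 1])    # [0. 1. 3.] [ 6.  8.  9. 10. 11.]
```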
Determining the Number of Clusters
The number of clusters k (the number of output neurons) has to be decided a priori
How is k determined?
There is no easy answer without prior knowledge of the data that indicates the number of clusters
If k is greater than the number of natural clusters in the data, then natural clusters are split into smaller regions
If k is less than the number of natural clusters in the data, then more than one natural cluster is grouped together
An iterative solution (sketched below):
Start with a small k, cluster, and find the value of J
Increment k by 1, cluster, and find the value of J
Repeat until J stops decreasing significantly (once k reaches the natural number of clusters, further increments yield little reduction in J)
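A sketch of this procedure (NumPy; `kmeans` is the routine sketched in the k-means section, and the 5% stopping threshold is an illustrative choice, not part of the original algorithm):

```python
import numpy as np

def distortion(X, centers, labels):
    """J = sum over clusters of squared distances to the cluster center."""
    return sum(np.sum((X[labels == i] - c) ** 2)
               for i, c in enumerate(centers))

def choose_k(X, kmeans, k_max=10, rel_tol=0.05):
    """Increase k until the drop in J becomes insignificant."""
    prev_J = np.inf
    for k in range(1, k_max + 1):
        centers, labels = kmeans(X, k)
        J = distortion(X, centers, labels)
        if prev_J - J < rel_tol * prev_J:   # J stopped improving much
            return k - 1                    # previous k was already adequate
        prev_J = J
    return k_max
```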
Soft Competition
Hard competition = winner-takes-all. That is, the winning neuron is active while the rest are inactive
Soft competition = winner-shares-reward. That is, the winning neuron activates its neighboring neurons as well
The winning neuron has the highest output, while its neighbors’ outputs decrease with increasing distance from the winner
Consequently, the weights of all active neurons are modified, not just those of the winning neuron
Soft competition is common in biological neural systems
Kohonen Self-Organizing Maps (1)
Kohonen SOMs are single-layer competitive networks with a soft competition rule
They perform a mapping from a continuous input space to a discrete output space, preserving the topological properties of the input
Topological map: points close to each other in the input space are mapped to the same or neighboring neurons in the output space
There is a topological ordering, or spatial significance, to the neurons in the output and input layers
Kohonen SOMs (2)
[Figure slide; the diagram is not captured in the transcript]
Kohonen SOMs (3)
1-D SOM has output neurons arranged as a string, with each neuron having two adjacent neighbors
2-D SOM has output neurons arranged as a 2-D array, with a more flexible definition of neighborhoods
Kohonen Learning Algorithm (1)
Competitive rule
wj(n + 1) = wj(n) + hj,j*(n) η(n) [x(n) – wj(n)]
hj,j* = neighborhood function centered on the winning neuron j*
A common neighborhood function is the Gaussian
hj,j*(n) = exp(–dj,j*² / 2σ(n)²)
where dj,j* is the distance between neuron j and the winner j* on the output lattice
Kohonen Learning Algorithm (2)
Initialization: Choose random values for the initial weight vectors. Make sure the weights for each neuron are different and of small magnitude
Sampling and similarity matching: select an input vector x. Find the best-matching (winning) neuron at time n, using the minimum-distance Euclidean criterion.
Updating: adjust the weight vectors using the Kohonen learning rule
Continuation: repeat the previous two steps until no noticeable changes in the feature map are observed; a compact sketch of the full loop follows
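A compact sketch of the whole algorithm for a 1-D map (NumPy; the learning rate, neighborhood width, and epoch count are illustrative and kept constant here, whereas in practice η(n) and σ(n) are decreased over time):

```python
import numpy as np

def train_som_1d(X, n_out=20, eta=0.1, sigma=2.0, epochs=100, seed=0):
    """Kohonen SOM with a 1-D output lattice."""
    rng = np.random.default_rng(seed)
    # Initialization: small, distinct random weights
    W = 0.01 * rng.standard_normal((n_out, X.shape[1]))
    pos = np.arange(n_out)                  # neuron positions on the lattice
    for _ in range(epochs):
        for x in X[rng.permutation(len(X))]:
            # Similarity matching: minimum Euclidean distance
            j_star = np.argmin(np.linalg.norm(W - x, axis=1))
            # Gaussian neighborhood over lattice distance to the winner
            h = np.exp(-(pos - pos[j_star]) ** 2 / (2 * sigma ** 2))
            # Kohonen update: all neurons move toward x, weighted by h
            W += eta * h[:, None] * (x - W)
    return W
```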
Example (1)
Create a 1-D feature map of a 2-D input space. The input vectors x lie on a unit circle.
No. of inputs = 2 (defined by the problem, 2-D)
No. of outputs = 20 (our choice, a 1-D linear array or lattice)
Randomly initialize the weights so that each neuron has a different weight (in this example the weights are initialized along one coordinate axis, i.e., one component is zero; see Example (2))
See the MATLAB code for details
Example (2)
Consider the input x = [0.7071 0.7071]T
Assume the current weights are W = [0 0; 0 1; 0 2; 0 3; …; 0 19], i.e., row j is wj = [0 j–1]T
Winning neuron is the one whose weight vector has the smallest distance (norm) from the input vector
Winning neuron is 2 (weight vector w2 = [0 1]T)
Norm = (0.7071² + 0.2929²)^1/2 = 0.7654 (this is the smallest distance from the weight vectors)
Update of weights
wj(n + 1) = wj(n) + hj,j*(n) η(n) [x(n) – wj(n)]
Assume a Gaussian neighborhood function with variance σ² = 0.5, so hj,j*(n) = exp(–dj,j*²), and η(n) = 1 (implied by the updates below)
Example (3)
Neuron 1
w1(n + 1) = w1(n) + h1,2*(n) η(n) [x(n) – w1(n)]
= [0 0] + 0.3679([0.7071 0.7071] – [0 0])
= [0.2601 0.2601]
Neuron 2 (winning neuron)
w2(n + 1) = w2(n) + h2,2*(n) η(n) [x(n) – w2(n)]
= [0 1] + (1)([0.7071 0.7071] – [0 1])
= [0.7071 0.7071]
Example (4)
Neuron 3
w3(n + 1) = w3(n) + h3,2*(n) η(n) [x(n) – w3(n)]
= [0 2] + 0.3679([0.7071 0.7071] – [0 2])
= [0.2601 1.5243]
Neuron 4
w4(n + 1) = w4(n) + h4,2*(n) η(n) [x(n) – w4(n)]
= [0 3] + 0.0183([0.7071 0.7071] – [0 3])
= [0.0129 2.9580]
Neuron 5 and beyond
Their updates will be negligible because the Gaussian neighborhood function tends to zero with increasing distance from the winner; the full step is reproduced in the sketch below
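The whole step can be checked with a few lines of NumPy (σ² = 0.5 and η = 1, as assumed above):

```python
import numpy as np

x = np.array([0.7071, 0.7071])
W = np.column_stack([np.zeros(20), np.arange(20.)])   # rows [0 0], [0 1], ..., [0 19]

j_star = np.argmin(np.linalg.norm(W - x, axis=1))     # index 1, i.e. neuron 2
d = np.abs(np.arange(20) - j_star)                    # lattice distance to the winner
h = np.exp(-d ** 2)                                   # Gaussian with 2*sigma^2 = 1
W_new = W + h[:, None] * (x - W)                      # eta = 1

print(j_star + 1)             # 2
print(W_new[:4].round(4))     # [[0.2601 0.2601]
                              #  [0.7071 0.7071]
                              #  [0.2601 1.5243]
                              #  [0.0129 2.958 ]]
```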