TRANSCRIPT
Self Organization: Competitive Learning
CS/CMPE 537 – Neural Networks
Introduction
Competition for limited resources is a way of diversifying and optimizing a distributed system, e.g., a sports tournament or the theory of evolution
Local competition leads to optimization at local and global levels without the need for global control to assign system resources
Neurons compete among themselves to decide the one (winner-takes-all) or ones that are activated
Through competitive learning, the output neurons form a topological map onto the input neurons
This gives self-organizing feature maps, where the network learns the input and output maps in a geometrical sense
Competition
Competition is intrinsically a nonlinear operation
Two types of competition:
Hard competition means that only one neuron wins the resources
Soft competition means that there is a clear winner, but its neighbors also share a small percentage of the system resources
Winner-Takes-All Network
Inputs xi (i = 1, …, p); the outputs are
yi = 1 if neuron i wins the competition
yi = 0 otherwise
The winning criterion can be any rule, as long as one clear winner is found (e.g. largest output, smallest output)
The winner-takes-all network has one layer with p neurons, self-excitatory connections, and lateral inhibitory connections to all other neurons
When an input is applied, the network evolves until the output of the winning neuron tends to 1 while the outputs of the other neurons tend to 0
The winner-takes-all network thus acts as a selector, as sketched below
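As a rough illustration in Python (assuming NumPy), the snippet below computes the selector's steady state directly with argmax rather than simulating the recurrent self-excitatory/lateral-inhibitory dynamics; the function name and the "largest output wins" criterion are illustrative choices:

```python
import numpy as np

def winner_takes_all(outputs):
    """One-hot selector: 1 for the winning neuron, 0 for all others.

    The winner is taken as the neuron with the largest output; any
    rule yielding a single clear winner would serve equally well.
    """
    y = np.zeros_like(outputs, dtype=float)
    y[np.argmax(outputs)] = 1.0
    return y

# Neuron 3 (index 2) has the largest output, so it alone is selected
print(winner_takes_all(np.array([0.2, 0.5, 0.9, 0.1])))  # [0. 0. 1. 0.]
```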
Instar Network (1)
The instar network is made up of McCulloch-Pitts neurons
An instar neuron is able to recognize a single vector pattern based on a modified Hebbian rule that automatically accomplishes weight normalization for normalized inputs
Instar Network (2)
Learning rule
wji(n + 1) = wji(n) + ηyj(n)[xi(n) – wji(n)]
yj is the hard-limiter output of either 0 or 1
The rule modifies the weights for neuron j only if neuron j is active (wins); otherwise, the weights remain unchanged
This rule is similar to Oja’s rule, except that the weights are not multiplied by the output
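A minimal sketch of one instar update step (NumPy; η and the particular vectors are illustrative values):

```python
import numpy as np

def instar_update(w, x, y, eta=0.1):
    """Instar rule: w(n+1) = w(n) + eta * y * (x - w).

    y is the hard-limiter output (0 or 1), so the weight vector moves
    toward the input only when the neuron is active, i.e. wins.
    """
    return w + eta * y * (x - w)

w = np.array([0.0, 1.0])
x = np.array([0.7, 0.7])
print(instar_update(w, x, y=1))  # [0.07 0.97] -- moved toward x
print(instar_update(w, x, y=0))  # [0. 1.]     -- unchanged
```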
Instar Network (3)
The effect of the instar rule is to move the weight vector toward the input along a straight line, by a step proportional to η
A trained instar neuron provides a response of 1 for data samples located near the training pattern (the bias controls the size of the neighborhood)
Competitive Rule for Winner-Takes-All Network
The hard-limiter in the instar network acts as a gate that controls the update of weights
If a winner-takes-all network is used, the gating mechanism is not needed, as the winning neuron is decided separately
Update rule
wj*i(n + 1) = wj*i(n) + η[xi(n) – wj*i(n)]
j* = winning neuron
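A sketch of a full training loop built on this rule (NumPy; initialization from random samples, η, and the epoch count are illustrative choices, and the winner is picked by Euclidean distance, one of the criteria discussed in the next section):

```python
import numpy as np

def competitive_train(X, n_neurons, eta=0.1, epochs=50, seed=0):
    """Single-layer hard-competitive learning.

    For each input, only the winning neuron j* is updated:
        w_j*(n+1) = w_j*(n) + eta * (x(n) - w_j*(n))
    """
    rng = np.random.default_rng(seed)
    # Initialize weights from random training samples so every neuron differs
    W = X[rng.choice(len(X), size=n_neurons, replace=False)].astype(float)
    for _ in range(epochs):
        for x in X[rng.permutation(len(X))]:
            j_star = np.argmin(np.linalg.norm(W - x, axis=1))  # winner
            W[j_star] += eta * (x - W[j_star])                 # update winner only
    return W
```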
Criterion for Competition (1)
What criterion should be used to determine the winning neuron?
The criterion should have some geometrical relevance
Inner product
A neuron wins if the input is closest to the neuron (i.e., to its weight vector)
However, this interpretation only holds when the inputs are normalized
Winning neuron is the one with the largest inner product
Criterion for Competition (2)
Euclidean distance
||x – w|| = [Σi (xi – wi)²]^1/2
Winning neuron is the one with the smallest Euclidean distance
Less efficient to compute than the inner product
L1 norm or Manhattan distance
|x – w| = Σi |xi – wi|
Winning neuron is the one with the smallest L1 distance
The three criteria are compared in the sketch below
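The three criteria side by side (NumPy; the input and weight rows are made-up values for illustration):

```python
import numpy as np

x = np.array([0.6, 0.8])                             # normalized input, ||x|| = 1
W = np.array([[1.0, 0.0],                            # one weight row per neuron
              [0.0, 1.0],
              [0.7, 0.7]])

winner_inner  = np.argmax(W @ x)                         # largest inner product
winner_euclid = np.argmin(np.linalg.norm(W - x, axis=1)) # smallest Euclidean distance
winner_l1     = np.argmin(np.abs(W - x).sum(axis=1))     # smallest L1 distance

print(winner_inner, winner_euclid, winner_l1)  # 2 2 2 -- all agree here
```

The criteria need not always agree; in particular, when inputs and weights are not normalized, the inner product can favor a neuron whose weight vector is merely large.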
Clustering
The competitive rule allows a single-layer linear network to group and represent data samples that lie in a neighborhood of the input space
Each neighborhood is represented by a single output neuron
The weights of the neuron represent the center of the cluster
Thus, a single-layer linear competitive network performs clustering
Clustering is a ‘divide and conquer’ process where an input space is divided into groups of proximal patterns
Vector quantization is an example of clustering
k-Means Clustering Algorithm (1)
Classical algorithm for data clustering
Cluster data vectors xi (i = 1, …, N) into k clusters Ci (i = 1, …, k) such that the distance between the vectors and their respective cluster centers ti (i = 1, …, k) is minimized
Algorithm
Randomly assign the vectors to clusters Ci
Find center of each cluster Ci
ti = (1/Ni) ΣnЄCi xn, where Ni is the number of vectors in Ci
Reassign the vectors to their nearest cluster, i.e., assign vector xn to cluster Cj if ||xn – tj||² is minimum over all j = 1, …, k
Repeat the last two steps until no changes occur; a minimal implementation is sketched below
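A minimal NumPy implementation of these steps (the function name, seed, and iteration cap are illustrative; it assumes no cluster ever becomes empty):

```python
import numpy as np

def kmeans(X, k, seed=0, max_iter=100):
    """Classical k-means: alternate center computation and reassignment."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, size=len(X))          # random initial assignment
    for _ in range(max_iter):
        # Center of each cluster: t_i = (1/N_i) * sum of its members
        centers = np.array([X[labels == i].mean(axis=0) for i in range(k)])
        # Reassign each vector to its nearest center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_labels = np.argmin(dists, axis=1)
        if np.array_equal(new_labels, labels):        # no change -> converged
            break
        labels = new_labels
    return centers, labels
```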
k-Means Clustering Algorithm (2)
Performance measure or function
J = Σi=1..k ΣnЄCi ||xn – ti||²
This measure is minimized by the k-means algorithm
Estimating J by its instantaneous value J(n) = ½||x(n) – ti(n)||² (for the center ti nearest to x(n)), then taking the derivative with respect to ti:
∂J(n)/∂ti = –[x(n) – ti(n)]
and, to minimize J by gradient descent, the change becomes
Δti(n) = –η ∂J(n)/∂ti = η[x(n) – ti(n)]
This is identical to the competitive update rule when ti is replaced by wi
Thus, the single-layer linear competitive network performs k-means clustering of the input data
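The same derivation in compact form (LaTeX; the ½ factor is the usual convention that makes the gradient come out clean and only rescales η):

```latex
J = \sum_{i=1}^{k} \sum_{n \in C_i} \lVert x_n - t_i \rVert^2,
\qquad
J(n) = \tfrac{1}{2} \lVert x(n) - t_i(n) \rVert^2
\;\Longrightarrow\;
\frac{\partial J(n)}{\partial t_i} = -\bigl[x(n) - t_i(n)\bigr],
\qquad
\Delta t_i(n) = -\eta \, \frac{\partial J(n)}{\partial t_i}
             = \eta \bigl[x(n) - t_i(n)\bigr]
```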
k-Means Clustering - Example
Cluster [0, 1, 3, 6, 8, 9, 10, 11] into 2 groups. N = 8, k = 2
C1 = [0, 1, 3, 6]; C2 = [8, 9, 10, 11]
t1 = 10/4 = 2.5; t2 = 38/4 = 9.5
C1 = [0, 1, 3]; C2 = [6, 8, 9, 10, 11] (the point 6 is equidistant from both centers; the tie is broken in favor of C2)
t1 = 4/3 = 1.33; t2 = 44/5 = 8.8
C1 = [0, 1, 3]; C2 = [6, 8, 9, 10, 11]
No change. Thus, these are the final clusters and centers
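The example can be reproduced with a few lines of NumPy (the tie-break for the point 6, which is equidistant from both initial centers, is made explicit to match the slide):

```python
import numpy as np

X = np.array([0., 1., 3., 6., 8., 9., 10., 11.])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])       # initial split, as above

while True:
    t = np.array([X[labels == i].mean() for i in (0, 1)])   # cluster centers
    # Reassign; ties (like the point 6) go to C2, matching the slide
    new = (np.abs(X - t[1]) <= np.abs(X - t[0])).astype(int)
    if np.array_equal(new, labels):
        break
    labels = new

print(t)                                 # [1.3333... 8.8]
print(X[labels == 0], X[labels == 1])    # [0. 1. 3.] [ 6.  8.  9. 10. 11.]
```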
Determining the Number of Clusters
The number of clusters k (the number of output neurons) has to be decided a priori
How is k determined?
There is no easy answer without prior knowledge of the data that indicates the number of clusters
If k is greater than the number of natural clusters in the data, then natural clusters are split into smaller regions
If k is less than the number of natural clusters in the data, then more than one natural cluster is grouped together
An iterative solution (sketched below):
Start with a small k, cluster, and find the value of J
Increment k by 1, cluster, and find the value of J
Repeat until J stops decreasing significantly (once k reaches the natural number of clusters, further increments yield little reduction in J)
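A sketch of this procedure (NumPy; `kmeans` is the routine sketched in the k-means section, and the 5% stopping threshold is an illustrative choice, not part of the original algorithm):

```python
import numpy as np

def distortion(X, centers, labels):
    """J = sum over clusters of squared distances to the cluster center."""
    return sum(np.sum((X[labels == i] - c) ** 2)
               for i, c in enumerate(centers))

def choose_k(X, kmeans, k_max=10, rel_tol=0.05):
    """Increase k until the drop in J becomes insignificant."""
    prev_J = np.inf
    for k in range(1, k_max + 1):
        centers, labels = kmeans(X, k)
        J = distortion(X, centers, labels)
        if prev_J - J < rel_tol * prev_J:   # J stopped improving much
            return k - 1                    # previous k was already adequate
        prev_J = J
    return k_max
```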
Soft Competition
Hard competition = winner-takes-all. That is, the winning neuron is active while the rest are inactive
Soft competition = winner-shares-reward. That is, the winning neuron activates its neighboring neurons as well
The winning neuron has the highest output, while its neighbors’ outputs decrease with increasing distance from the winner
Consequently, the weights of all active neurons are modified, not just those of the winning neuron
Soft competition is common in biological neural systems
Kohonen Self-Organizing Maps (1)
Kohonen SOMs are single-layer competitive networks with a soft competition rule
They perform a mapping from a continuous input space to a discrete output space, preserving the topological properties of the input
Topological map: points close to each other in the input space are mapped to the same or neighboring neurons in the output space
There is a topological ordering, or spatial significance, to the neurons in the output and input layers
Kohonen SOMs (2)
[Figure slide; the diagram is not captured in the transcript]
Kohonen SOMs (3)
1-D SOM has output neurons arranged as a string, with each neuron having two adjacent neighbors
2-D SOM has output neurons arranged as a 2-D array, with a more flexible definition of neighborhoods
Kohonen Learning Algorithm (1)
Competitive rule
wj(n + 1) = wj(n) + hj,j*(n) η(n) [x(n) – wj(n)]
hj,j* = neighborhood function centered on the winning neuron j*
A common neighborhood function is the Gaussian
hj,j*(n) = exp(–dj,j*² / 2σ(n)²)
where dj,j* is the distance between neuron j and the winner j* on the output lattice
Kohonen Learning Algorithm (2)
Initialization: Choose random values for the initial weight vectors. Make sure the weights for each neuron are different and of small magnitude
Sampling and similarity matching: select an input vector x. Find the best-matching (winning) neuron at time n, using the minimum-distance Euclidean criterion.
Updating: adjust the weight vectors using the Kohonen learning rule
Continuation: repeat the previous two steps until no noticeable changes in the feature map are observed; a compact sketch of the full loop follows
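A compact sketch of the whole algorithm for a 1-D map (NumPy; the learning rate, neighborhood width, and epoch count are illustrative and kept constant here, whereas in practice η(n) and σ(n) are decreased over time):

```python
import numpy as np

def train_som_1d(X, n_out=20, eta=0.1, sigma=2.0, epochs=100, seed=0):
    """Kohonen SOM with a 1-D output lattice."""
    rng = np.random.default_rng(seed)
    # Initialization: small, distinct random weights
    W = 0.01 * rng.standard_normal((n_out, X.shape[1]))
    pos = np.arange(n_out)                  # neuron positions on the lattice
    for _ in range(epochs):
        for x in X[rng.permutation(len(X))]:
            # Similarity matching: minimum Euclidean distance
            j_star = np.argmin(np.linalg.norm(W - x, axis=1))
            # Gaussian neighborhood over lattice distance to the winner
            h = np.exp(-(pos - pos[j_star]) ** 2 / (2 * sigma ** 2))
            # Kohonen update: all neurons move toward x, weighted by h
            W += eta * h[:, None] * (x - W)
    return W
```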
Example (1)
Create a 1-D feature map of a 2-D input space. The input vectors x lie on a unit circle.
No. of inputs = 2 (defined by the problem, 2-D)
No. of outputs = 20 (our choice, a 1-D linear array or lattice)
Randomly initialize the weights so that each neuron has a different weight (in this example the weights are initialized along one coordinate axis, i.e., one component is zero; see Example (2))
See the MATLAB code for details
Example (2)
Consider the input x = [0.7071 0.7071]T
Assume the current weights are W = [0 0; 0 1; 0 2; 0 3; …; 0 19], i.e., row j is wj = [0 j–1]T
Winning neuron is the one whose weight vector has the smallest distance (norm) from the input vector
Winning neuron is 2 (weight vector w2 = [0 1]T)
Norm = (0.7071² + 0.2929²)^1/2 = 0.7654 (this is the smallest distance from the weight vectors)
Update of weights
wj(n + 1) = wj(n) + hj,j*(n) η(n) [x(n) – wj(n)]
Assume a Gaussian neighborhood function with variance σ² = 0.5, so hj,j*(n) = exp(–dj,j*²), and η(n) = 1 (implied by the updates below)
Example (3)
Neuron 1
w1(n + 1) = w1(n) + h1,2*(n) η(n) [x(n) – w1(n)]
= [0 0] + 0.3679([0.7071 0.7071] – [0 0])
= [0.2601 0.2601]
Neuron 2 (winning neuron)
w2(n + 1) = w2(n) + h2,2*(n) η(n) [x(n) – w2(n)]
= [0 1] + (1)([0.7071 0.7071] – [0 1])
= [0.7071 0.7071]
Example (4)
Neuron 3
w3(n + 1) = w3(n) + h3,2*(n) η(n) [x(n) – w3(n)]
= [0 2] + 0.3679([0.7071 0.7071] – [0 2])
= [0.2601 1.5243]
Neuron 4
w4(n + 1) = w4(n) + h4,2*(n) η(n) [x(n) – w4(n)]
= [0 3] + 0.0183([0.7071 0.7071] – [0 3])
= [0.0129 2.9580]
Neuron 5 and beyond
Their updates will be negligible because the Gaussian neighborhood function tends to zero with increasing distance from the winner; the full step is reproduced in the sketch below
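The whole step can be checked with a few lines of NumPy (σ² = 0.5 and η = 1, as assumed above):

```python
import numpy as np

x = np.array([0.7071, 0.7071])
W = np.column_stack([np.zeros(20), np.arange(20.)])   # rows [0 0], [0 1], ..., [0 19]

j_star = np.argmin(np.linalg.norm(W - x, axis=1))     # index 1, i.e. neuron 2
d = np.abs(np.arange(20) - j_star)                    # lattice distance to the winner
h = np.exp(-d ** 2)                                   # Gaussian with 2*sigma^2 = 1
W_new = W + h[:, None] * (x - W)                      # eta = 1

print(j_star + 1)             # 2
print(W_new[:4].round(4))     # [[0.2601 0.2601]
                              #  [0.7071 0.7071]
                              #  [0.2601 1.5243]
                              #  [0.0129 2.958 ]]
```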