
Page 1: Self organising list

SELF ORGANIZING LIST

Page 2: Self organising list

What is a self-organizing list?

It is a list that reorders elements based on some self-organizing heuristics to improve average access time.

The simplest implementation of a self-organizing list is as a linked list.

While a linked list is efficient for random node insertion and memory allocation, it suffers from inefficient access to random nodes.
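As a rough sketch of such a linked-list representation (the class and field names below are illustrative, not from the slides):

```python
class Node:
    """Singly linked list node; 'count' is used only by the count heuristic."""
    def __init__(self, value):
        self.value = value
        self.count = 0
        self.next = None


class SelfOrganizingList:
    """Skeleton of a linked-list based self-organizing list.
    New nodes are appended at the tail; the reordering heuristics
    (MTF, transpose, count) are sketched on later slides."""
    def __init__(self):
        self.head = None

    def insert(self, value):
        node = Node(value)
        if self.head is None:
            self.head = node
            return
        cur = self.head
        while cur.next is not None:
            cur = cur.next
        cur.next = node
```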

Page 3: Self organising list

HISTORY

The concept of self-organizing list was introduced by McCabe in 1965.

In a pioneering work, he introduced two heuristics: the MTF rule and the transposition rule.

Further improvements were made and new algorithms suggested by Ronald Rivest, D. Knuth and others.

Page 4: Self organising list

AIM OF SELF ORGANIZING LIST

The aim of this list is to improve the efficiency of linear search by moving more frequently accessed items towards the head of the list.

To improve the average access time of nodes.

Page 5: Self organising list

ANALYSIS OF RUNNING TIMES FOR ACCESS/SEARCH IN A LIST

THE THREE CASES ARE:
AVERAGE CASE
WORST CASE
BEST CASE

Page 6: Self organising list

AVERAGE CASE ANALYSIS

Usually, the time taken is neither the worst case time nor the best case time.

Average case analysis assumes a probability distribution for the set of possible inputs and gives the expected time:

T_avg = 1*p(1) + 2*p(2) + 3*p(3) + … + n*p(n)

where p(i) is the probability of accessing the i-th element in the list, also called the access probability.

If the access probability of each element is the same (i.e. p(1) = p(2) = … = p(n) = 1/n), then the ordering of the elements is irrelevant and the average time complexity is given by:

T(n) = 1/n + 2/n + 3/n + … + n/n = (1 + 2 + 3 + … + n)/n = (n+1)/2
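As a quick sanity check of the uniform case (a minimal sketch, not from the slides):

```python
# Expected search cost when every element has access probability 1/n.
n = 5
t_uniform = sum(i * (1 / n) for i in range(1, n + 1))
print(t_uniform)    # 3.0
print((n + 1) / 2)  # 3.0, matching the (n+1)/2 formula
```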

Page 7: Self organising list

T(n) doesn’t depend on the individual access probabilities of the elements in the list in this case.

However, in the case of searches on lists with non-uniform record access probabilities, the average time complexity can be reduced drastically by proper positioning of the elements in the list.

This is demonstrated in the next slide.

Page 8: Self organising list

Given list: a(0.1), b(0.1), c(0.3), d(0.1), e(0.4).

Without rearranging: T(n) = 1*0.1 + 2*0.1 + 3*0.3 + 4*0.1 + 5*0.4 = 3.6.

Now the nodes are arranged so that the nodes with the highest probability of access are placed closest to the front.

New order: e(0.4), c(0.3), d(0.1), a(0.1), b(0.1), giving T(n) = 1*0.4 + 2*0.3 + 3*0.1 + 4*0.1 + 5*0.1 = 2.2.

Thus the average time required for searching an organized list is much less than the time required to search a randomly arranged list.
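These numbers can be checked with a short sketch (the helper name expected_cost is illustrative, not from the slides):

```python
def expected_cost(items):
    """items is a list of (key, access_probability) pairs in list order;
    the i-th element (1-indexed) costs i comparisons to reach."""
    return sum(i * p for i, (_, p) in enumerate(items, start=1))

unordered = [("a", 0.1), ("b", 0.1), ("c", 0.3), ("d", 0.1), ("e", 0.4)]
ordered   = [("e", 0.4), ("c", 0.3), ("d", 0.1), ("a", 0.1), ("b", 0.1)]

print(expected_cost(unordered))  # ≈ 3.6
print(expected_cost(ordered))    # ≈ 2.2
```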

Page 9: Self organising list

WORST AND BEST CASE

WORST CASE: The element is located at the very end of the list, whether it is a normal list or a self-organized list, and thus n comparisons must be made to reach it.

BEST CASE: The element to be searched is one which has been accessed frequently and has thus been identified by the list and kept at the head, so it is found in a single comparison.

Page 10: Self organising list

TECHNIQUES FOR REARRANGING NODES

Page 11: Self organising list

DIFFERENT TYPES OF TECHNIQUES

MOVE TO FRONT (MTF)
TRANSPOSE
COUNT
ORDERING

Page 12: Self organising list

MOVE TO FRONT(MTF)

This technique moves the element which is accessed to the head of the list.

This has the advantage of being easily implemented and requiring no extra memory.

This heuristic also adapts quickly to rapid changes in the query distribution.
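A minimal sketch of MTF search, assuming the Node/SelfOrganizingList skeleton from the earlier slide:

```python
def search_mtf(lst, value):
    """Search for 'value'; if found, unlink the node and splice it in at the head."""
    prev, cur = None, lst.head
    while cur is not None and cur.value != value:
        prev, cur = cur, cur.next
    if cur is None:
        return None              # not found; list unchanged
    if prev is not None:         # cur is not already the head
        prev.next = cur.next     # unlink the node
        cur.next = lst.head      # move it to the front
        lst.head = cur
    return cur
```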

Page 13: Self organising list

MTF TECHNIQUE

Page 14: Self organising list

TRANSPOSE

This technique involves swapping an accessed node with its predecessor.

Therefore, if any node is accessed, it is swapped with the node in front unless it is the head node, thereby increasing its priority.

This algorithm is again easy to implement and space efficient and is more likely to keep frequently accessed nodes at the front of the list.
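A minimal sketch of the transpose heuristic, again assuming the earlier Node/SelfOrganizingList skeleton:

```python
def search_transpose(lst, value):
    """Search for 'value'; if found, swap the node with its predecessor."""
    prev2, prev, cur = None, None, lst.head
    while cur is not None and cur.value != value:
        prev2, prev, cur = prev, cur, cur.next
    if cur is None or prev is None:
        return cur               # not found, or already the head node
    prev.next = cur.next         # predecessor now points past the accessed node
    cur.next = prev              # accessed node moves one step forward
    if prev2 is None:
        lst.head = cur
    else:
        prev2.next = cur
    return cur
```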

Page 15: Self organising list

TRANSPOSE TECHNIQUE

Page 16: Self organising list

COUNT

In this technique, the number of times each node has been searched for is counted, i.e. every node keeps a separate counter variable which is incremented every time the node is accessed.

The nodes are then rearranged according to decreasing count.

Thus, the nodes of highest count i.e. most frequently accessed are kept at the head of the list.
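A sketch of the count heuristic on the same linked-list skeleton (assumed from the earlier slide):

```python
def search_count(lst, value):
    """Search for 'value'; if found, increment its counter and reinsert the node
    so that the list stays ordered by decreasing access count."""
    prev, cur = None, lst.head
    while cur is not None and cur.value != value:
        prev, cur = cur, cur.next
    if cur is None:
        return None
    cur.count += 1
    # Unlink the accessed node.
    if prev is not None:
        prev.next = cur.next
    else:
        lst.head = cur.next
    # Reinsert it before the first node with a smaller count.
    p, q = None, lst.head
    while q is not None and q.count >= cur.count:
        p, q = q, q.next
    cur.next = q
    if p is None:
        lst.head = cur
    else:
        p.next = cur
    return cur
```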

Page 17: Self organising list

COUNT TECHNIQUE

Page 18: Self organising list

ORDERING TECHNIQUE

This technique keeps the list ordered according to a particular criterion.

For example, in a list with nodes 2, 3, 7, 10, if we want to access node 5, the accessed node is added between 3 and 7 to maintain the increasing order of the list.
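A sketch of ordered insertion into the same linked-list skeleton (the function name is illustrative):

```python
def insert_ordered(lst, value):
    """Insert 'value' at the position that keeps the list in increasing order."""
    node = Node(value)
    if lst.head is None or value < lst.head.value:
        node.next = lst.head
        lst.head = node
        return
    cur = lst.head
    while cur.next is not None and cur.next.value < value:
        cur = cur.next
    node.next = cur.next
    cur.next = node

# Example from the slide: inserting 5 into 2 -> 3 -> 7 -> 10 places it between 3 and 7.
```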

Page 19: Self organising list

DIFFERENCE BETWEEN THESE TECHNIQUES

In the first three techniques, i.e. move to front, transpose and count, a new node is added at the end of the list, whereas in the ordering technique a new node is added at whatever position maintains the order of the list.

Page 20: Self organising list
Page 21: Self organising list


EXAMPLE SHOWING ALL TECHNIQUES

Page 22: Self organising list

Calculating cost

An inversion is defined to be a pair of elements (x, y) such that in one of the lists x precedes y and in the other list y precedes x. For example, the list [C, B, D, A] has four inversions with respect to the list [A, B, C, D], which are: (C,A), (B,A), (D,A), and (C,B). The amortized cost of an access is defined to be the sum of the actual cost and the change in the number of inversions caused by the access, i.e. the number of inversions after accessing the element minus the number before:

amCost(x) = cost(x) + (inversionsAfterAccess(x) - inversionsBeforeAccess(x))
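A small sketch of this calculation (the lists and the MTF access below are illustrative; 'reference' is the second list against which inversions are counted):

```python
def inversions(lst, reference):
    """Count pairs (x, y) whose relative order differs between the two lists."""
    pos = {x: i for i, x in enumerate(reference)}
    count = 0
    for i in range(len(lst)):
        for j in range(i + 1, len(lst)):
            if pos[lst[i]] > pos[lst[j]]:
                count += 1
    return count

reference = ["A", "B", "C", "D"]
before = ["C", "B", "D", "A"]
print(inversions(before, reference))   # 4: (C,A), (B,A), (D,A), (C,B)

# Amortized cost of accessing "D" (3 comparisons) if MTF then moves it to the front:
after = ["D", "C", "B", "A"]
actual = before.index("D") + 1
am_cost = actual + (inversions(after, reference) - inversions(before, reference))
print(am_cost)                         # 3 + (6 - 4) = 5
```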

Page 23: Self organising list

APPLICATIONS

Language translators like compilers and interpreters use self-organizing lists to maintain symbol tables during compilation or interpretation of program source code.

Currently, research is underway to incorporate the self-organizing list data structure into embedded systems to reduce bus transition activity, which leads to power dissipation in some circuits.

These lists are also used in artificial intelligence and neural networks as well as self-adjusting programs.

The algorithms used in self-organizing lists are also used as caching algorithms, as in the case of the LFU algorithm.

Page 24: Self organising list

Conclusion

Given a set of search data, the optimal static ordering arranges the records precisely according to the frequency of their occurrence in the search data. In the long run, the frequency count and MTF techniques are at most about twice as costly as the optimal static ordering, and the count technique approaches the cost of the MTF technique, although the exact behaviour depends on the data.

Page 25: Self organising list

THANK YOU!