Data Structures and Algorithms: Searching Algorithms
M. B. Fayek, CUFE 2006


Page 1: Data Structures and Algorithms Searching Algorithms M. B. Fayek CUFE 2006


Page 2

Agenda
1. Introduction
2. Sequential Search
3. Binary Search
4. Interpolation Search
5. Indexed Search

Page 3

1. Introduction
What is a Search?

“Searching is the task of finding a certain data item (record) in a large collection of such items.”

A key field that identifies the item sought is given. (For simplification we consider only the key field instead of the complete record.)

If the item is found, either its location or the complete item is returned.

If the item is not found, an indication is given, usually by returning a non-existing index such as -1.


Page 5

2. Sequential Search
Sequential Search is also called Exhaustive Search because the complete collection may have to be searched.

Page 6

2. Sequential Search

[Figure: sequential search for Keyitem = 20 in a list shown on the slide as 15 13 6 20 21 17 8 41. The test "Key item = list[i]?" (NO / YES) is applied at i = 0, i = 1, i = 2, and the slide returns i = 2 as the found location.]

Page 7

2. Sequential Search
The first implementation will be:

    for i = 0 to n-1 do
        get next item Ai
        if Ai == k return i
    endfor
    return -1
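The loop above can be sketched in Python as follows. This is a minimal illustration rather than the lecture's own code: the function name `sequential_search` and the sample list are assumptions made for the example.

```python
from typing import Sequence


def sequential_search(items: Sequence[int], key: int) -> int:
    """Return the index of the first item equal to key, or -1 if it is absent."""
    for i in range(len(items)):      # i takes the values 0 .. n-1, never n
        if items[i] == key:          # the basic operation: one comparison
            return i
    return -1                        # a non-existing index signals "not found"


sample = [7, 3, 9, 5]                # hypothetical data, not from the slides
print(sequential_search(sample, 9))  # position of 9 in sample
print(sequential_search(sample, 4))  # -1, because 4 is not in sample
```

The last valid index is n-1, so the loop must stop there; reading item n would run past the end of the collection.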

Page 8

2. Sequential Search
Another pseudo code is:

    i <-- 0
    while i < n and item Ai <> k
        i <-- i+1
    if i < n
        return i
    else
        return -1

Check boundary conditions! ← The test i < n must be evaluated before item Ai is accessed.

Page 9

2. Sequential Search
How is the algorithm implemented?
The way the collection is constructed affects the way the next item Ai is retrieved.

In a static array: Ai is the indexed item A[i].

In a linked list: Ai is the next node to be fetched by following the “next pointer” in the present node. In this case usually the address of the node found (a pointer to the found node) is returned, or a NULL pointer to indicate that it was not found.

In a file: Ai is the next record retrieved from the file.
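For the linked-list case, a minimal Python sketch might look like this. The `Node` class and the name `list_search` are hypothetical, introduced only for this example; Python's `None` plays the role of the NULL pointer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    next: Optional["Node"] = None    # the "next pointer" of the node


def list_search(head: Optional[Node], key: int) -> Optional[Node]:
    """Return the node holding key, or None (the NULL pointer) if it is absent."""
    node = head
    # Check for the end of the list before looking at the node's key.
    while node is not None and node.key != key:
        node = node.next             # fetch the next item by following the pointer
    return node


head = Node(15, Node(13, Node(6)))   # hypothetical three-node list
found = list_search(head, 13)
print(found.key if found else "not found")
```

Returning the node itself rather than an index matches the remark above that a pointer to the found node is what is usually returned for linked lists.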

Page 10

2. Sequential Search
Complexity:
The basic operation is the comparison.
For a collection of n data items there are several cases:

Best case: item found at the first location. Number of comparisons = 1

Worst case: item found at the last location or item not found. Number of comparisons = n

Average case (item present, every location equally likely): number of comparisons = (1+n)/2
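The average-case figure can be checked with a short derivation. It assumes, which the slide does not state explicitly, that the key is present and equally likely to be at any of the n locations, so that finding it at location i costs i comparisons:

```latex
\[
\text{average comparisons}
  = \sum_{i=1}^{n} \frac{1}{n}\, i
  = \frac{1}{n}\cdot\frac{n(n+1)}{2}
  = \frac{n+1}{2}
\]
```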

Page 11

2. Sequential Search Enhancements
Sequential Search may be enhanced using several techniques:
1. Sorting before searching (Presorting)
2. Sentinel Search
3. Probabilistic Search

Page 12

2. Sequential Search Enhancements
1. Presorting
A good question to ask before searching is whether the collection is sorted or not.

How do we use that information? If the collection is sorted (in ascending order), the search is terminated as soon as the value of the indexed item in the collection exceeds that of the search item.

What is the effect? This will not affect the worst case of finding the element at the last position, but it will decrease the average number of comparisons when the element is not in the list and its logical position is somewhere before the end of the list.

A more efficient search for a sorted collection is the binary search.

Page 13

2. Sequential Search Enhancements
2. Sentinel Search
The basic loop in sequential search includes 2 comparisons at each iteration:

    while ( (i < n) && (key <> A[i]) )

To decrease the number of comparisons to one per iteration, a sentinel value = key is inserted at the end of the array (beyond its end, i.e. at index n).

Hence the first comparison (i < n) becomes redundant. The search will always stop, finding key either within A (if it already existed) or outside A, at the sentinel position (if it originally did not exist).

A check on the location at which key was found will indicate whether it existed or not.
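A minimal Python sketch of the sentinel idea follows. It assumes the collection is a Python list that may be extended temporarily by one slot; the function name `sentinel_search` is made up for the example.

```python
from typing import List


def sentinel_search(items: List[int], key: int) -> int:
    """Sequential search with a sentinel; returns the index of key or -1."""
    n = len(items)
    items.append(key)            # sentinel stored just beyond the data, at index n
    i = 0
    while items[i] != key:       # only one comparison per iteration
        i += 1                   # cannot run off the end: the sentinel stops it
    items.pop()                  # remove the sentinel, restoring the original list
    return i if i < n else -1    # stopping at index n means only the sentinel matched


sample = [15, 13, 6]             # hypothetical data
print(sentinel_search(sample, 13))
print(sentinel_search(sample, 99))
```

The single check `i < n` after the loop replaces the bounds test that the plain version performs on every iteration.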

Page 14

2. Sequential Search Enhancements
3. Probabilistic Search
The basic idea here is that popular elements of the list, which are searched for more frequently, should require fewer comparisons to find.

This is implemented by advancing an element found in the array one location towards the front each time it is searched for, by swapping it with the element before it.

Hence, each time an element is found, the number of comparisons needed to find it next time is decremented by one (unless it is already at the first location).
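The swapping step can be sketched in Python as below. The function name `probabilistic_search` and the choice to return the element's new position after the swap are assumptions of this example, not something the slide specifies.

```python
from typing import List


def probabilistic_search(items: List[int], key: int) -> int:
    """Find key and move it one location towards the front.

    Returns the position of key after the swap, or -1 if it is absent.
    """
    for i in range(len(items)):
        if items[i] == key:
            if i == 0:
                return 0                                   # already at the front
            items[i - 1], items[i] = items[i], items[i - 1]  # swap with predecessor
            return i - 1
    return -1


sample = [15, 13, 6, 20]             # hypothetical data
probabilistic_search(sample, 20)
print(sample)                        # 20 has moved one location forward
```

Repeated searches for the same key keep moving it forward, so frequently requested keys drift towards the start of the list.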

Page 15

2. Sequential Search
Modifying the first sequential algorithm for the case of a sorted list would be:

    for i = 0 to n-1 do
        if Ai > k return -1    // as the list is sorted, the possible location has been passed
        if Ai == k return i
    endfor
    return -1
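A Python sketch of this early-termination variant, assuming the list is sorted in ascending order (the function name `sorted_sequential_search` and the sample data are hypothetical):

```python
from typing import Sequence


def sorted_sequential_search(items: Sequence[int], key: int) -> int:
    """Sequential search in an ascending list; stops once key has been passed."""
    for i, item in enumerate(items):
        if item > key:       # every later item is larger still, so key is absent
            return -1
        if item == key:
            return i
    return -1                # key is larger than every item in the list


sample = [6, 8, 13, 15, 17, 20, 21, 41]      # hypothetical sorted data
print(sorted_sequential_search(sample, 20))  # found
print(sorted_sequential_search(sample, 14))  # gives up on reaching 15
```

For a missing key such as 14, the search stops at the fourth item instead of scanning all eight, which is the saving described for unsuccessful searches.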

Page 16

2. Sequential Search
Modifying the second sequential algorithm for the case of a sorted list would be:

    i <-- 0
    while i < n and next item Ai < k
        i <-- i+1
    if i < n and Ai == k
        return i
    else
        return -1


Page 18

3. Binary Search
How does it work?
Basic idea: at each search step the list is divided into 2 sublists by checking the mid item; the range to be searched for a possible location is then either the left or the right sublist (i.e. it is decreased to half). This requires the collection to be sorted.

Note, however, that the determination of the middle item in the collection is a simple task if the data collection is represented in memory by a sequential array, whereas it is not so if the collection is represented using a linked list. Hence we will assume that the collection is a sequential array.

Page 19

3. Binary Search

[Figure: binary search for Keyitem = 20 in a list of n = 8 items, shown on the slide as 15 13 65 20 21 27 38 41. The test "Key item = list[mid]?" (NO / YES) is applied at mid = 4 (Key item < list[mid]), then at mid = 2 (Key item > list[mid]), then at mid = 3, where the key is found: 3 comparisons.]

Page 20

3. Binary Search
For the same input and output specs as before, the algorithm is:

    low = 0; high = n-1;
    while (low <= high) do
    {
        mid = (low+high)/2
        if ( k < A[mid] ) then high = mid - 1
        else if ( k > A[mid] ) then low = mid + 1
        else return mid    // found
    }
    return -1    // not found
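A runnable Python sketch of the same algorithm is given below, assuming a list sorted in ascending order. The function name `binary_search` and the sample data are assumptions of the example. The loop condition is written as `low <= high` so that a range containing a single item is still examined.

```python
from typing import Sequence


def binary_search(items: Sequence[int], key: int) -> int:
    """Return an index of key in the ascending sequence items, or -1 if absent."""
    low, high = 0, len(items) - 1
    while low <= high:               # the range low..high is still non-empty
        mid = (low + high) // 2      # integer division gives the middle index
        if key < items[mid]:
            high = mid - 1           # continue in the left sublist
        elif key > items[mid]:
            low = mid + 1            # continue in the right sublist
        else:
            return mid               # found
    return -1                        # range became empty: not found


sample = [6, 8, 13, 15, 17, 20, 21, 41]   # hypothetical sorted data
print(binary_search(sample, 20))
print(binary_search(sample, 14))
```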

Page 21

3. Binary Search
Complexity:
For a collection of n data items:
In each step the mid item is compared to k and the range of search is divided by 2.
This is repeated until the range is empty (in the worst case).
i.e. we should ask: how many times can we divide n by 2 until the length of the sublist drops below one?

→ about log2 n comparisons, which is better than the n comparisons of sequential search.
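A short worked version of this count is sketched below. It assumes that the range is halved exactly at every step and that each step costs one three-way comparison, neither of which the slide spells out; t denotes the number of steps.

```latex
\[
\frac{n}{2^{t}} < 1 \iff t > \log_2 n
\quad\Longrightarrow\quad
t_{\text{worst}} = \lfloor \log_2 n \rfloor + 1
\]
% Example: n = 1\,000\,000 gives \lfloor \log_2 10^{6} \rfloor + 1 = 19 + 1 = 20 steps,
% compared with up to 1\,000\,000 comparisons for sequential search.
```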


Page 23

4. Interpolation Search
What is meant by interpolation?
Here we try to guess more precisely where the search key resides.
Instead of calculating the middle as the physical middle (low+high)/2, it is calculated in a weighted manner w.r.t. the value of k relative to the max and min values in the list:

\[ mid = low + \frac{k - A[low]}{A[high] - A[low]} \, (high - low) \]
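A Python sketch of interpolation search using the weighted position mid = low + (k - A[low]) / (A[high] - A[low]) × (high - low) is shown below. It assumes an ascending list of integers. The function name `interpolation_search`, the guard against equal end values, and the range check in the loop condition are additions of this example, not part of the slide.

```python
from typing import Sequence


def interpolation_search(items: Sequence[int], key: int) -> int:
    """Return an index of key in the ascending sequence items, or -1 if absent."""
    low, high = 0, len(items) - 1
    # Probe only while key lies between the end values of the current range.
    while low <= high and items[low] <= key <= items[high]:
        if items[high] == items[low]:
            return low               # all values in the range equal key here
        # Weighted estimate of the position, using integer arithmetic.
        mid = low + (key - items[low]) * (high - low) // (items[high] - items[low])
        if key < items[mid]:
            high = mid - 1
        elif key > items[mid]:
            low = mid + 1
        else:
            return mid
    return -1


sample = [4, 7, 11, 15, 17, 38, 53, 67, 71, 74, 83, 92]   # illustrative sorted data
print(interpolation_search(sample, 53))
print(interpolation_search(sample, 50))
```

The guard avoids a division by zero when A[high] equals A[low], and the range check keeps the estimated mid between low and high.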

Page 24

4. Interpolation Search
Analysis:
Calculating mid is more complex than in binary search.
Significant improvement in search time, especially when the values of the data items in the collection are evenly distributed (on average about log2 log2 n comparisons in that case, although the worst case for unevenly distributed values grows to n).


Page 26

5. Indexed Search
What is an index?
Similar to the index of a book (e.g. a telephone book), items in the index point to significant items in the collection.

This implies that in this search an additional table is used, the index table, where each item in the index table points to a specific location in the original search list.

Page 27

5. Indexed Search
Algorithm:
// Input: Search array A of n items + index table of d items + key item k
// Output: Location of the item with the search key, or a not-found indication (false)

Step 1: Determine the search range for the key within the index table by specifying (i_min to i_max) inside the original search list.

Step 2: Search sequentially for the key in the range (i_min to i_max) inside the original search list.
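The two steps can be sketched in Python as follows. The representation of the index table as a list of (key value, position) pairs, the function name `indexed_search`, and the sample data are assumptions made for this illustration. The list is taken to be sorted in ascending order, and -1 is used as the not-found indication.

```python
from typing import List, Sequence, Tuple


def indexed_search(items: Sequence[int], index: List[Tuple[int, int]], key: int) -> int:
    """Search items for key with the help of an index table; -1 if absent."""
    # Step 1: scan the index table sequentially to fix the range i_min..i_max.
    i_min, i_max = 0, len(items) - 1
    for index_key, position in index:
        if key <= index_key:
            i_max = position         # key cannot lie beyond this indexed item
            break
        i_min = position + 1         # key lies after this indexed item
    # Step 2: sequential search inside the range only.
    for i in range(i_min, i_max + 1):
        if items[i] == key:
            return i
    return -1


sample = [4, 7, 11, 15, 17, 38, 53, 67, 71, 74, 83, 92]   # illustrative sorted list
table = [(11, 2), (38, 5), (71, 8), (92, 11)]             # every third item is indexed
print(indexed_search(sample, table, 53))
```

For key = 53 the index entries 38 and 71 bracket the key, so Step 2 starts at position 5 + 1 = 6 and needs to look no further than position 8.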

Page 28

5. Indexed Search
Algorithm:

[Figure: worked example of an indexed search.]

Index table (key value : position in list)
    11 : 2
    38 : 5
    71 : 8
    92 : 11

Pos:   0   1   2   3   4   5   6   7   8   9  10  11
List:  4   7  11  15  17  38  53  67  71  74  83  92

Searching for key = 53
Step 1: in the index table the key lies between the entries 38 and 71, so the search range in the list is marked between positions 5 and 8.
Step 2: the sequential search inside that range finds the key at Pos = 5 + 1 = 6.

Page 29

5. Indexed Search
Analysis: Assuming that:
the original table is of size n
the index is of size d

Step 1: Determining the search range has average complexity O(d/2).

Step 2: Searching for the key in the range (i_min to i_max) inside the original search list, assuming an average range length of n/d, has average complexity O(n/(2d)).

\[ \text{Total average complexity} = O\!\left(\frac{d}{2} + \frac{n}{2d}\right) \]
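As a hedged aside, the total suggests how large the index should be. Writing the total average cost as f(d) = d/2 + n/(2d), i.e. half the index size plus half the average range length n/d, and treating d as a continuous variable (an assumption made here only for the calculation):

```latex
\[
f(d) = \frac{d}{2} + \frac{n}{2d}, \qquad
f'(d) = \frac{1}{2} - \frac{n}{2d^{2}} = 0
\;\Longrightarrow\; d = \sqrt{n}, \qquad
f(\sqrt{n}) = \frac{\sqrt{n}}{2} + \frac{\sqrt{n}}{2} = \sqrt{n}
\]
```

Under these assumptions an index of about √n entries balances the two steps, giving roughly √n comparisons on average.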