
Post on 05-Jan-2016


1

HASHING PROJECT

2

SEARCHING DATA STRUCTURES

•Consider a set of data with N data items stored in some data structure

•We must be able to insert, delete & search for items

•What are possible ways to do this? What is the complexity of each structure & method?

3

DATA STRUCTURES

•Unsorted Array

•Sorted Array

•Linked List

•Binary Search Tree

•Heap

•What are advantages & disadvantages of each?

•Time complexities of each?

•What about memory requirements?

4

SOFTWARE DEVELOPMENT

•Ask THESE questions – always!

•What if a data structure already exists & you MUST use it?

•Must be able to use most efficiently.

5

HASHING

•Technique for data storage & retrieval in constant time, O(1)

•Do you believe it???

•Perfect Hashing fits this definition, but not hashing in general

6

HASHING

•A storage & retrieval technique in which a data item (key) is converted to the address in which it will be stored. The same conversion is used to retrieve the data.

•Example – MSU M-number – 8 digits – index into an array

•Phone numbers?

•Problem?
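The problem with using the key directly as an array index can be made concrete with a little arithmetic. The sketch below assumes the keys are plain 8-digit numbers; the record count is a hypothetical figure chosen only for illustration.

```python
# Direct addressing: the key itself is the array index.
# Assumption: keys are 8-digit numbers, so the table needs 10**8 slots.
KEY_DIGITS = 8
slots_needed = 10 ** KEY_DIGITS


def direct_table_utilization(num_records: int) -> float:
    """Fraction of a directly addressed table that actually holds data."""
    return num_records / slots_needed


# Hypothetical record count, for illustration only.
print(slots_needed)                      # 100000000
print(direct_table_utilization(20_000))  # 0.0002
```

Nearly all of the table would sit empty, which is why a hash function is used to squeeze the keys into a much smaller range of addresses.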

7

HASH FUNCTION

•Mathematical operation which converts a search key into a hash table address

•The modulo function is OFTEN used as part of the hash function

•Examples:

•M-number mod Table size
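A minimal sketch of a modulo-based hash function follows. The table size of 101 and the sample keys are assumptions made up for this example, not values from the project.

```python
def hash_mod(key: int, table_size: int) -> int:
    """Map a non-negative integer key to a table address in [0, table_size)."""
    return key % table_size


TABLE_SIZE = 101  # assumed table size, chosen only for illustration

# Hypothetical 8-digit keys.
for key in (12345678, 12345779, 87654321):
    print(key, "->", hash_mod(key, TABLE_SIZE))
```

The first two keys differ by exactly the table size, so they produce the same address (44), while the third lands at 57. A modulo hash cannot tell apart keys that differ by a multiple of the table size.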

8

COLLISION

•A collision occurs when 2 different search keys hash to the same table address.

•Collision Resolution Policy (CRP) – strategy for selecting an alternate location for the hashed item that cannot be placed in the computed table address

•CRP – affects the complexity of the hashing process

•Examples

9

OPEN ADDRESSING - CRP

•Select an alternate location in the table

•Linear Probing – beginning at original hash location, sequentially search the table for available location. {+1}

•Incremental Probing – use an increment other than 1 {+c}

•Double Hashing – use a second function to determine the probe increment {+f(n)}
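The three probing policies above differ only in how the probe increment is chosen, so they can share one insertion routine. This is a sketch under stated assumptions: the table size is prime (7 here) so that every increment eventually visits every slot, and the step functions, including the second hash `1 + key % 5`, are hypothetical choices made for the example.

```python
from typing import Callable, List, Optional


def insert_open(table: List[Optional[int]], key: int,
                step: Callable[[int], int]) -> int:
    """Insert key by open addressing; return the number of probes used."""
    size = len(table)
    home = key % size          # original hash location
    increment = step(key)      # distance between successive probes
    for i in range(size):
        slot = (home + i * increment) % size
        if table[slot] is None:
            table[slot] = key
            return i + 1
    raise RuntimeError("no free slot found along the probe sequence")


def linear_step(key: int) -> int:
    return 1                   # linear probing: {+1}


def incremental_step(key: int) -> int:
    return 3                   # fixed increment other than 1 (assumed value)


def double_step(key: int) -> int:
    return 1 + key % 5         # second hash function; never returns 0


# 10, 17 and 24 all hash to address 3 in a table of size 7.
for name, step in (("linear", linear_step), ("double", double_step)):
    table = [None] * 7
    probes = [insert_open(table, k, step) for k in (10, 17, 24)]
    print(name, probes, table)
```

With linear probing the three colliding keys need 1, 2 and 3 probes and end up in adjacent slots. With double hashing they need 1, 2 and 2 probes, because each key follows its own increment away from the shared home address.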

10

OTHER CRP

•Bucket Hashing – Each hash address is actually a set of table locations

•Chaining – a linked list at each hash address contains all keys that hash there

•Table format for each?
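One possible table format for chaining is sketched below. It is an illustration only: Python lists stand in for the linked lists, the class and method names are hypothetical, and the hash function is assumed to be key mod table size.

```python
class ChainedTable:
    """Hash table with chaining: each address holds a list of keys."""

    def __init__(self, size: int) -> None:
        # One initially empty chain per hash address.
        self.chains = [[] for _ in range(size)]

    def _chain(self, key: int) -> list:
        return self.chains[key % len(self.chains)]

    def insert(self, key: int) -> None:
        chain = self._chain(key)
        if key not in chain:       # ignore duplicates
            chain.append(key)

    def search(self, key: int) -> bool:
        return key in self._chain(key)

    def delete(self, key: int) -> bool:
        chain = self._chain(key)
        if key in chain:
            chain.remove(key)
            return True
        return False


table = ChainedTable(7)
for key in (10, 17, 24, 5):
    table.insert(key)
print(table.chains[3])   # keys that hash to address 3: [10, 17, 24]
```

Colliding keys never displace each other, so the table cannot overflow, but a search may have to walk the whole chain at one address.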

11

TABLE SIZE??

•How big should a hash table be?

•How full should the table get?

•Implications of table size?

12

MEASURING HASH PERFORMANCE

•Hash Function Complexity?

•Probes: number of hash table locations “probed” (checked) before finding an empty location (CRP)

•Consider AVERAGE for a large data set

•Table size: want smallest that provides few probes
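The average probe count can be measured empirically. The sketch below assumes linear probing, a modulo hash, and randomly generated 8-digit-range keys; the table sizes and the fixed seed are arbitrary choices for illustration, and no particular results are claimed.

```python
import random


def average_probes(num_keys: int, table_size: int, seed: int = 0) -> float:
    """Average probes per insertion using key % table_size and linear probing."""
    if num_keys > table_size:
        raise ValueError("open addressing cannot hold more keys than slots")
    rng = random.Random(seed)                  # fixed seed: repeatable runs
    keys = rng.sample(range(10 ** 8), num_keys)  # distinct random keys
    table = [None] * table_size
    total = 0
    for key in keys:
        slot = key % table_size
        probes = 1
        while table[slot] is not None:         # probe until an empty slot
            slot = (slot + 1) % table_size
            probes += 1
        table[slot] = key
        total += probes
    return total / num_keys


# Same data set size, three assumed table sizes.
for table_size in (1009, 1511, 2003):
    print(table_size, round(average_probes(1000, table_size), 2))
```

Running the same data set against several table sizes shows the trade-off directly: a nearly full table costs many more probes per insertion than one that is about half full.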

13

OUR SEMESTER PROJECT – HASHING

•Analysis

•Empirical Studies

•Table Sizes

•CRPs

•Functions
