introduction to algorithms

Algorithms and Data Structures.

Searching and Sorting in Java
Petar Petrov
10.12.2015

Table of Contents

• Algorithm Analysis
• Sorting
• Searching
• Data Structures

Algorithm

An algorithm is a set of instructions to be followed to solve a problem.

Properties

• Correctness
• Finiteness
• Definiteness
• Input
• Output
• Effectiveness

Algorithmic Performance

There are two aspects of algorithmic performance:

• Time
• Space

Theoretical Analysis

First, we count the number of basic operations in a particular solution to assess its efficiency.

Then, we express the efficiency of algorithms using growth functions.

Algorithm Growth Rates

We measure an algorithm's time requirement as a function of the problem size.

The most important thing to learn is how quickly the algorithm's time requirement grows as a function of the problem size.

An algorithm's proportional time requirement is known as its growth rate.

We can compare the efficiency of two algorithms by comparing their growth rates.

The Execution Time of Algorithms

Each operation in an algorithm (or a program) has a cost: it takes a certain amount of time.

    count = count + 1;    takes a certain amount of time, but that time is constant

A sequence of operations:

    count = count + 1;    Cost: c1
    sum = sum + count;    Cost: c2

Total Cost = c1 + c2

Example: Simple If-Statement

                          Cost   Times
    if (n < 0)            c1     1
        absval = -n;      c2     1
    else
        absval = n;       c3     1

Total Cost <= c1 + max(c2, c3)


Example: Simple Loop

                            Cost   Times
    i = 1;                  c1     1
    sum = 0;                c2     1
    while (i <= n) {        c3     n+1
        i = i + 1;          c4     n
        sum = sum + i;      c5     n
    }

Total Cost = c1 + c2 + (n+1)*c3 + n*c4 + n*c5

The time required for this algorithm is proportional to n.


Example: Nested Loop

                                Cost   Times
    i = 1;                      c1     1
    sum = 0;                    c2     1
    while (i <= n) {            c3     n+1
        j = 1;                  c4     n
        while (j <= n) {        c5     n*(n+1)
            sum = sum + i;      c6     n*n
            j = j + 1;          c7     n*n
        }
        i = i + 1;              c8     n
    }

Total Cost = c1 + c2 + (n+1)*c3 + n*c4 + n*(n+1)*c5 + n*n*c6 + n*n*c7 + n*c8

The time required for this algorithm is proportional to n^2.
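The two loop examples can also be checked empirically. The following is a minimal sketch, not part of the original slides: the class and method names are hypothetical, and it simply counts how often the innermost loop body runs instead of measuring real time.

```java
final class OperationCountSketch {
    /** Counts executions of the body of the simple loop (i = 1..n). */
    static long simpleLoopSteps(int n) {
        long steps = 0;
        int i = 1;
        while (i <= n) {
            i = i + 1;
            steps++; // one execution of the loop body
        }
        return steps;
    }

    /** Counts executions of the innermost body of the nested loop. */
    static long nestedLoopSteps(int n) {
        long steps = 0;
        int i = 1;
        while (i <= n) {
            int j = 1;
            while (j <= n) {
                j = j + 1;
                steps++; // one execution of the inner loop body
            }
            i = i + 1;
        }
        return steps;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 100, 1000}) {
            System.out.println("n=" + n
                    + " simple=" + simpleLoopSteps(n)
                    + " nested=" + nestedLoopSteps(n));
        }
    }
}
```

Multiplying n by 10 multiplies the first count by 10 and the second by 100, which is what "proportional to n" and "proportional to n^2" mean in practice.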


General Rules for Estimation

• Consecutive Statements
• If/Else
• Loops
• Nested Loops

Big-O and friends

Informal definitions:
◦ Given a complexity function f(n),
◦ f(n) is O(g(n)) if g(n) is an upper bound on f(n), up to a constant factor
◦ f(n) is Ω(g(n)) if g(n) is a lower bound on f(n), up to a constant factor
◦ f(n) is Θ(g(n)) if g(n), given the correct constants, correctly describes f(n), i.e. it is both an upper and a lower bound

Example: If f(n) = 17n^3 + 4n - 12, then
◦ f(n) is O(g(n)) for g(n) = n^3, n^4, n^5, 2^n, etc.
◦ f(n) is Ω(g(n)) for g(n) = 1, n, n^2, n^3, log n, n log n, etc.
◦ f(n) is Θ(n^3)
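As a worked check of the example f(n) = 17n^3 + 4n - 12, the bounds can be made explicit. The constants 5 and 21 below are one convenient choice, not the only one:

```latex
\begin{align*}
f(n) &= 17n^3 + 4n - 12 \\
     &\le 17n^3 + 4n^3 = 21n^3
        && \text{for } n \ge 1 \quad (4n \le 4n^3,\ -12 < 0), \\
f(n) &\ge 17n^3 - 12 \ge 17n^3 - 12n^3 = 5n^3
        && \text{for } n \ge 1 \quad (4n \ge 0,\ 12 \le 12n^3).
\end{align*}
```

So 5n^3 <= f(n) <= 21n^3 for all n >= 1, which is exactly what it means for n^3 to describe f(n) "given the correct constants".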

The Execution Time of Algorithms

In Big-O terms, the three examples have these running times:

• Simple if-statement: O(1)
• Simple loop: O(n)
• Nested loop: O(n^2)

Common Growth Rates

Function    Growth Rate Name
c           Constant
log N       Logarithmic
log^2 N     Log-squared
N           Linear
N log N     Linearithmic
N^2         Quadratic
N^3         Cubic
2^N         Exponential

[Figure: A Comparison of Growth-Rate Functions]

The Sorting Problem

Input:
◦ A sequence of n numbers a1, a2, . . . , an

Output:
◦ A permutation (reordering) a1’, a2’, . . . , an’ of the input sequence such that a1’ ≤ a2’ ≤ · · · ≤ an’

In-Place

• In-place sort: the amount of extra space required to sort the data is constant with respect to the input size.

Stable

• Stable sort: preserves the relative order of records with equal keys.

[Figure: a file sorted on the first key is then sorted on the second key. With an unstable sort, the records with key value 3 are no longer in order on the first key.]

Insertion Sort

Idea: like sorting a hand of playing cards
◦ Start with an empty left hand and the cards facing down on the table.
◦ Remove one card at a time from the table, and insert it into the correct position in the left hand.
◦ The cards held in the left hand are always sorted.

Example: the left hand holds 6 10 24 36 and the next card is 12. To insert 12, we need to make room for it by moving first 36 and then 24. The hand then holds 6 10 12 24 36.

Pseudo-code

    insertionsort (a) {
        for (i = 1; i < a.length; ++i) {
            key = a[i]
            pos = i
            while (pos > 0 && a[pos-1] > key) {
                a[pos] = a[pos-1]
                pos--
            }
            a[pos] = key
        }
    }

Insertion sort
• O(n^2) time
• Stable
• In-place, O(1) extra space
• Great with a small number of elements
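The insertion sort pseudo-code translates almost line for line into Java. This is a minimal sketch for int arrays; the class name is hypothetical.

```java
import java.util.Arrays;

final class InsertionSortSketch {
    /** Sorts the array in place in ascending order. */
    static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; ++i) {
            int key = a[i];   // the "card" being inserted
            int pos = i;
            // Shift larger elements one slot to the right to make room.
            while (pos > 0 && a[pos - 1] > key) {
                a[pos] = a[pos - 1];
                pos--;
            }
            a[pos] = key;
        }
    }

    public static void main(String[] args) {
        int[] cards = {6, 10, 24, 36, 12};
        insertionSort(cards);
        System.out.println(Arrays.toString(cards)); // [6, 10, 12, 24, 36]
    }
}
```

The strict comparison `a[pos - 1] > key` is what keeps the sort stable: an element is never moved past an equal one.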

Selection sort

Algorithm:
◦ Find the minimum value
◦ Swap with 1st position value
◦ Repeat with 2nd position down

O(n^2), in-place, not stable in this swap-based form (a swap can move an element past an equal one)
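The three selection sort steps can be sketched in Java as follows. The class name is hypothetical and the sketch handles int arrays only.

```java
import java.util.Arrays;

final class SelectionSortSketch {
    /** Sorts the array in place in ascending order. */
    static void selectionSort(int[] a) {
        for (int i = 0; i < a.length - 1; i++) {
            // Find the index of the minimum value in a[i..end].
            int min = i;
            for (int j = i + 1; j < a.length; j++) {
                if (a[j] < a[min]) {
                    min = j;
                }
            }
            // Swap it into position i, then repeat from position i + 1.
            int tmp = a[i];
            a[i] = a[min];
            a[min] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] values = {40, 20, 10, 30, 7};
        selectionSort(values);
        System.out.println(Arrays.toString(values)); // [7, 10, 20, 30, 40]
    }
}
```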

Bubble sort

Algorithm:
◦ Traverse the collection
◦ "Bubble" the largest value to the end using pairwise comparisons and swapping

O(n^2), stable, in-place

Totally useless?
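A minimal Java sketch of bubble sort is shown here. The early exit when a pass makes no swaps is a common refinement that the slides do not mention, and the class name is hypothetical.

```java
import java.util.Arrays;

final class BubbleSortSketch {
    /** Sorts the array in place in ascending order. */
    static void bubbleSort(int[] a) {
        for (int end = a.length - 1; end > 0; end--) {
            boolean swapped = false;
            // One traversal: the largest value of a[0..end] bubbles to index end.
            for (int i = 0; i < end; i++) {
                if (a[i] > a[i + 1]) {
                    int tmp = a[i];
                    a[i] = a[i + 1];
                    a[i + 1] = tmp;
                    swapped = true;
                }
            }
            if (!swapped) {
                return; // no swaps in this pass: the array is already sorted
            }
        }
    }

    public static void main(String[] args) {
        int[] values = {40, 20, 10, 30, 7};
        bubbleSort(values);
        System.out.println(Arrays.toString(values)); // [7, 10, 20, 30, 40]
    }
}
```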

Mergesort

1. Divide: split the array in two halves
2. Conquer: sort both subarrays recursively
3. Combine: merge the two sorted subarrays into a sorted array

Pseudo-code

    mergesort (a, left, right) {
        if (left < right) {
            mid = (left + right) / 2
            mergesort (a, left, mid)
            mergesort (a, mid+1, right)
            merge (a, left, mid+1, right)
        }
    }

Merging

The key to Merge Sort is merging two sorted lists into one: given two sorted lists X = (x1 x2 … xm) and Y = (y1 y2 … yn), the resulting sorted list is Z = (z1 z2 … zm+n).

Example:
L1 = { 3 8 9 }
L2 = { 1 5 7 }
merge(L1, L2) = { 1 3 5 7 8 9 }

Merging: step-by-step trace

    X:             Y:             Result:
    3 10 23 54     1 5 25 75      (empty)
    3 10 23 54     5 25 75        1
    10 23 54       5 25 75        1 3
    10 23 54       25 75          1 3 5
    23 54          25 75          1 3 5 10
    54             25 75          1 3 5 10 23
    54             75             1 3 5 10 23 25
    (empty)        75             1 3 5 10 23 25 54
    (empty)        (empty)        1 3 5 10 23 25 54 75
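The merging procedure traced above can be sketched in Java as follows. Note the assumptions: unlike the index-based pseudo-code, which sorts a range of one array, this version returns new arrays, which keeps the example short at the cost of extra copying. The class and method names are hypothetical.

```java
import java.util.Arrays;

final class MergeSortSketch {
    /** Merges two sorted arrays into one new sorted array. */
    static int[] merge(int[] x, int[] y) {
        int[] z = new int[x.length + y.length];
        int i = 0, j = 0, k = 0;
        // Repeatedly take the smaller front element; "<=" keeps the merge stable.
        while (i < x.length && j < y.length) {
            z[k++] = (x[i] <= y[j]) ? x[i++] : y[j++];
        }
        // One input is exhausted; copy what remains of the other.
        while (i < x.length) z[k++] = x[i++];
        while (j < y.length) z[k++] = y[j++];
        return z;
    }

    /** Returns a sorted array with the elements of a; a itself is not modified. */
    static int[] mergeSort(int[] a) {
        if (a.length <= 1) {
            return Arrays.copyOf(a, a.length);
        }
        int mid = a.length / 2;
        int[] left = mergeSort(Arrays.copyOfRange(a, 0, mid));          // divide + conquer
        int[] right = mergeSort(Arrays.copyOfRange(a, mid, a.length));
        return merge(left, right);                                      // combine
    }

    public static void main(String[] args) {
        int[] x = {3, 10, 23, 54};
        int[] y = {1, 5, 25, 75};
        System.out.println(Arrays.toString(merge(x, y)));
        System.out.println(Arrays.toString(mergeSort(new int[] {99, 6, 86, 15, 58, 35, 86, 4, 0})));
    }
}
```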

Merge Sort Example

Input: 99 6 86 15 58 35 86 4 0

Divide:

    99 6 86 15 | 58 35 86 4 0
    99 6 | 86 15 | 58 35 | 86 4 0
    99 | 6 | 86 | 15 | 58 | 35 | 86 | 4 0
    4 | 0

Merge:

    99 | 6 | 86 | 15 | 58 | 35 | 86 | 0 4
    6 99 | 15 86 | 35 58 | 0 4 86
    6 15 86 99 | 0 4 35 58 86
    0 4 6 15 35 58 86 86 99

Merge Sort Analysis

Merge Sort runs in O(N log N) time in all cases: the array is always split in half, and merging the two sorted halves takes linear time.

T(N) = 2T(N/2) + N = O(N log N)

Quicksort

1. Select: pick an element x (the pivot)
2. Divide: rearrange elements so that x goes to its final position
   • L: elements less than x
   • G: elements greater than or equal to x
3. Conquer: sort L and G recursively

[Figure: the array is rearranged around x into L, x, G; L and G are then sorted recursively]

Pseudo-code

    quicksort (a, left, right) {
        if (left < right) {
            pivot = partition (a, left, right)
            quicksort (a, left, pivot-1)
            quicksort (a, pivot+1, right)
        }
    }

Key steps

• How to pick a pivot?
• How to partition?

How to pick a pivot

• Use the first element as pivot
  ◦ if the input is random, ok
  ◦ if the input is presorted? - shuffle in advance
• Choose the pivot randomly
  ◦ generally safe
  ◦ random number generation can be expensive

A better pivot

• Use the median of the array
  ◦ Partitioning always cuts the array in half
  ◦ An optimal quicksort (O(n log n))
  ◦ hard to find the exact median (chicken-egg?)
  ◦ so use an approximation to the exact median
• Median of three
  ◦ Compare just three elements: the leftmost, the rightmost and the center
  ◦ Use the middle of the three as pivot

How to partition

Given a pivot, partition the elements of the array such that the resulting array consists of:
◦ One subarray that contains elements < pivot
◦ One subarray that contains elements >= pivot

The subarrays are stored in the original array.

Partitioning example (pivot_index = 0, pivot value 40)

    index:  [0] [1] [2] [3] [4] [5] [6] [7] [8]
    start:   40  20  10  80  60  50   7  30 100

too_big_index scans from the left, too_small_index scans from the right:

1. while a[too_big_index] <= a[pivot_index]
       ++too_big_index
2. while a[too_small_index] > a[pivot_index]
       --too_small_index
3. if too_big_index < too_small_index
       swap a[too_big_index] and a[too_small_index]
4. while too_small_index > too_big_index, go to 1.
5. swap a[too_small_index] and a[pivot_index]

    after the first swap (80 and 30):     40 20 10 30 60 50 7 80 100
    after the second swap (60 and 7):     40 20 10 30 7 50 60 80 100
    after step 5 (pivot_index = 4):       7 20 10 30 40 50 60 80 100
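The partitioning steps above, combined with the quicksort pseudo-code, can be sketched in Java like this. It assumes the first element of each range is the pivot, as in the walkthrough; the extra bounds check on the left-to-right scan is an addition the slides leave implicit. Class and method names are hypothetical.

```java
import java.util.Arrays;

final class QuickSortSketch {
    /** Partitions a[left..right] around a[left]; returns the pivot's final index. */
    static int partition(int[] a, int left, int right) {
        int pivot = a[left];
        int tooBig = left + 1;   // scans right, looking for an element > pivot
        int tooSmall = right;    // scans left, looking for an element <= pivot
        while (true) {
            while (tooBig <= right && a[tooBig] <= pivot) tooBig++;
            while (a[tooSmall] > pivot) tooSmall--; // stops at 'left' at the latest
            if (tooBig < tooSmall) {
                swap(a, tooBig, tooSmall);
            } else {
                break;
            }
        }
        swap(a, left, tooSmall); // move the pivot to its final position
        return tooSmall;
    }

    static void quickSort(int[] a, int left, int right) {
        if (left < right) {
            int pivotIndex = partition(a, left, right);
            quickSort(a, left, pivotIndex - 1);
            quickSort(a, pivotIndex + 1, right);
        }
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static void main(String[] args) {
        int[] a = {40, 20, 10, 80, 60, 50, 7, 30, 100};
        System.out.println(partition(a, 0, a.length - 1)); // 4
        System.out.println(Arrays.toString(a));            // [7, 20, 10, 30, 40, 50, 60, 80, 100]
        quickSort(a, 0, a.length - 1);
        System.out.println(Arrays.toString(a));
    }
}
```

With a first-element pivot, already sorted input triggers the worst case, which is why the slides suggest shuffling, a random pivot, or median of three.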

Quicksort Analysis

Running time
◦ pivot selection: constant time, i.e. O(1)
◦ partitioning: linear time, i.e. O(N)
◦ running time of the two recursive calls

T(N) = T(i) + T(N-i-1) + cN, where c is a constant
◦ i: number of elements in L

Worst-Case Analysis

What will be the worst case?
◦ The pivot is the smallest element, all the time
◦ Partition is always unbalanced
◦ T(N) = T(N-1) + cN, which gives O(N^2)

Best-Case Analysis

What will be the best case?
◦ Partition is perfectly balanced
◦ Pivot is always in the middle (median of the array)
◦ T(N) = 2T(N/2) + cN, which gives O(N log N)

Using Java API Sorting Methods

• The Java API provides the class Arrays with several overloaded sort methods for different array types.
• The class Collections provides similar sorting methods for lists.

Java API Sorting Interface

Arrays methods:

    public static void sort(int[] a)
    public static void sort(Object[] a)                              // requires Comparable
    public static <T> void sort(T[] a, Comparator<? super T> comp)   // uses given Comparator

Collections methods:

    public static <T extends Comparable<? super T>> void sort(List<T> list)
    public static <T> void sort(List<T> l, Comparator<? super T> comp)
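A short usage sketch of these API methods follows. The helper class, its method names, and the sample data are illustrative only; `Arrays.sort`, `Collections.sort` and `Comparator.comparingInt` are the standard library calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class ApiSortSketch {
    /** Returns a sorted copy using Arrays.sort(int[]). */
    static int[] sortedCopy(int[] values) {
        int[] copy = Arrays.copyOf(values, values.length);
        Arrays.sort(copy);
        return copy;
    }

    /** Returns a copy sorted in natural (Comparable) order. */
    static List<String> sortedNaturally(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        Collections.sort(copy);
        return copy;
    }

    /** Returns a copy sorted by string length, using an explicit Comparator. */
    static List<String> sortedByLength(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        Collections.sort(copy, Comparator.comparingInt(String::length));
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sortedCopy(new int[] {40, 20, 10, 30, 7})));
        List<String> words = Arrays.asList("pear", "fig", "banana");
        System.out.println(sortedNaturally(words)); // [banana, fig, pear]
        System.out.println(sortedByLength(words));  // [fig, pear, banana]
    }
}
```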

Searching

• Given the collection and an element to find…
• Determine whether the "target" element was found in the collection
  ◦ Print a message
  ◦ Return a value (an index or pointer, etc.)
• Don't modify the collection in the search!

Linear Search

A search traverses the collection until
◦ the desired element is found
◦ or the collection is exhausted

Pseudo-code

    linearsearch (a, key) {
        for (i = 0; i < a.length; i++) {
            if (a[i] == key) return i
        }
        return -1
    }

Linear Search: search for 20 in 40 20 10 30 7

    40 != 20
    20 = 20, return 1

Linear Search: search for 5 in 40 20 10 30 7

    40 != 5
    20 != 5
    10 != 5
    30 != 5
    7 != 5, return -1

Linear Search
• O(n)
• Examines every item in the worst case
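In Java, the linear search pseudo-code becomes the following minimal sketch for int arrays (the class name is hypothetical):

```java
final class LinearSearchSketch {
    /** Returns the index of the first occurrence of key, or -1 if it is absent. */
    static int linearSearch(int[] a, int key) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] == key) {
                return i;
            }
        }
        return -1; // the collection is exhausted
    }

    public static void main(String[] args) {
        int[] a = {40, 20, 10, 30, 7};
        System.out.println(linearSearch(a, 20)); // 1
        System.out.println(linearSearch(a, 5));  // -1
    }
}
```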

Binary Search

Locates a target value in a sorted array/list by successively eliminating half of the array on each step.

Pseudo-code

    binarysearch (a, low, high, key) {
        while (low <= high) {
            mid = (low + high) >>> 1
            midVal = a[mid]
            if (midVal < key) low = mid + 1
            else if (midVal > key) high = mid - 1
            else return mid
        }
        return -(low + 1)
    }

Binary Search: search for 4

    index:  [0] [1] [2] [3] [4] [5] [6] [7] [8]
    value:   1   3   4   6   7   8  10  13  14

    left = 0, right = 8: middle value 7, 4 < 7, so continue in the left half
    left = 0, right = 3: middle value 3, 4 > 3, so continue in the right half
    left = 2, right = 3: middle value 4, 4 = 4, return 2 (the index of 4)

Binary Search: search for 9 in the same array

    left = 0, right = 8: middle value 7, 9 > 7, so continue in the right half
    left = 5, right = 8: middle value 10, 9 < 10, so continue in the left half
    left = 5, right = 5: middle value 8, 9 > 8, so continue in the right half
    left = 6, right = 5: right < left, return -7, i.e. -(left + 1)

Binary search
• Requires a sorted array/list
• O(log n)
• Divide and conquer
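A Java sketch of binary search on a sorted int array is given below. It narrows the upper bound with `high = mid - 1` and, for a missing key, returns -(insertion point + 1), the same convention `java.util.Arrays.binarySearch` uses. The class name is hypothetical.

```java
final class BinarySearchSketch {
    /**
     * Searches the sorted range a[low..high] for key.
     * Returns its index, or -(insertion point + 1) if key is absent.
     */
    static int binarySearch(int[] a, int low, int high, int key) {
        while (low <= high) {
            int mid = (low + high) >>> 1; // unsigned shift avoids int overflow
            int midVal = a[mid];
            if (midVal < key) {
                low = mid + 1;   // discard the left half
            } else if (midVal > key) {
                high = mid - 1;  // discard the right half
            } else {
                return mid;      // found
            }
        }
        return -(low + 1);       // not found
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 4, 6, 7, 8, 10, 13, 14};
        System.out.println(binarySearch(a, 0, a.length - 1, 4)); // 2
        System.out.println(binarySearch(a, 0, a.length - 1, 9)); // -7
    }
}
```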

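Collections and Map API

Interfaces (each nested interface extends its parent):
• Collection
  ◦ List
  ◦ Set
    ◦ SortedSet
• Map
  ◦ SortedMap

Classes (each implements the interface shown):
• LinkedList, ArrayList: List
• HashSet: Set
• TreeSet: SortedSet
• HashMap: Map
• TreeMap: SortedMap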

Interfaces

• Set
  ◦ The familiar set abstraction.
  ◦ No duplicates; may or may not be ordered.
• List
  ◦ Ordered collection, also known as a sequence.
  ◦ Duplicates permitted; allows positional access.
• Map
  ◦ A mapping from keys to values.
  ◦ Each key can map to at most one value (function).

Collections Framework Implementations

    Set              List          Map
    HashSet          ArrayList     HashMap
    LinkedHashSet    LinkedList    LinkedHashMap
    TreeSet          Vector        Hashtable
                                   TreeMap

Collections Characteristics

• Ordered
  ◦ Elements are stored and accessed in a specific order
• Sorted
  ◦ Elements are stored and accessed in a sorted order
• Indexed
  ◦ Elements can be accessed using an index
• Unique
  ◦ Collection does not allow duplicates

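Linked List

• A linked list is a series of connected nodes
• Each node contains at least
  ◦ A piece of data (any type)
  ◦ A pointer to the next node in the list
• Head: pointer to the first node
• The last node points to NULL

[Figure: Head -> A -> B -> C; each node holds data and a pointer]

Insert Element

[Figure: a new node D is linked into the list Head -> A -> B -> C by redirecting pointers]

Remove Element

[Figure: node A is unlinked, leaving Head -> B -> C]

Find, Access Element

[Figure: the list Head -> A -> B -> C is traversed from Head]

Doubly-Linked List

[Figure: Head -> A <-> B <-> C <- Tail; each node holds data, a next pointer and a previous pointer]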

Operations Complexity (doubly-linked list with head and tail pointers)

    Operation                  Complexity
    insert at beginning        O(1)
    insert at end              O(1)
    insert at index            O(n)
    delete at beginning        O(1)
    delete at end              O(1)
    delete at index            O(n)
    find element               O(n)
    access element by index    O(n)
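To make the node-and-pointer idea concrete, here is a minimal singly-linked list sketch. It is an illustration only, with hypothetical names, and is much simpler than `java.util.LinkedList`, which is doubly linked.

```java
final class LinkedListSketch {
    /** One node: a piece of data plus a pointer to the next node. */
    private static final class Node {
        final String data;
        Node next;

        Node(String data, Node next) {
            this.data = data;
            this.next = next;
        }
    }

    private Node head; // null when the list is empty

    /** Insert at beginning: O(1), only the head pointer changes. */
    void addFirst(String data) {
        head = new Node(data, head);
    }

    /** Delete at beginning: O(1). Returns null if the list is empty. */
    String removeFirst() {
        if (head == null) {
            return null;
        }
        String data = head.data;
        head = head.next;
        return data;
    }

    /** Find element: O(n), walks the chain from the head. Returns -1 if absent. */
    int indexOf(String data) {
        int index = 0;
        for (Node n = head; n != null; n = n.next, index++) {
            if (n.data.equals(data)) {
                return index;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        LinkedListSketch list = new LinkedListSketch();
        list.addFirst("C");
        list.addFirst("B");
        list.addFirst("A");                     // Head -> A -> B -> C
        System.out.println(list.indexOf("C"));  // 2
        System.out.println(list.removeFirst()); // A
    }
}
```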

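Array List

• Resizable-array implementation of the List interface
• capacity vs. size

Insert Element At End

[Figure: while capacity > size, a new element (D) is written into the next free slot after A B C; when capacity = size, the elements are first copied into a larger array and then the new element (E) is appended to A B C D]

Insert, Remove, Find, Access

[Figure: the array A B C]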

Operations Complexity

    Operation                  Complexity
    insert at beginning        O(n)
    insert at end              O(1) amortized
    insert at index            O(n)
    delete at beginning        O(n)
    delete at end              O(1)
    delete at index            O(n)
    find element               O(n)
    access element by index    O(1)
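The difference between capacity and size, and why appending is O(1) amortized, can be seen in a minimal resizable-array sketch. The initial capacity of 4 and the doubling policy are assumptions made for illustration; `java.util.ArrayList` uses its own growth policy. All names are hypothetical.

```java
import java.util.Arrays;

final class DynamicArraySketch {
    private Object[] elements = new Object[4]; // capacity = elements.length
    private int size = 0;                      // number of slots actually used

    /** Insert at end: O(1) unless the backing array is full and must grow. */
    void add(Object element) {
        if (size == elements.length) {
            // capacity = size: copy everything into an array twice as large.
            elements = Arrays.copyOf(elements, elements.length * 2);
        }
        elements[size++] = element;
    }

    /** Access by index: O(1). */
    Object get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return elements[index];
    }

    int size() {
        return size;
    }

    int capacity() {
        return elements.length;
    }

    public static void main(String[] args) {
        DynamicArraySketch list = new DynamicArraySketch();
        for (String s : new String[] {"A", "B", "C", "D", "E"}) {
            list.add(s);
        }
        System.out.println(list.size() + " of " + list.capacity()); // 5 of 8
    }
}
```

Because each growth step doubles the capacity, the occasional O(n) copy is spread over many cheap appends.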

Stacks and Queues

Some collections are constrained so clients can only use optimized operations:
◦ stack: retrieves elements in the reverse order they were added
◦ queue: retrieves elements in the same order they were added

[Figure: a stack holding 1 (bottom), 2, 3 (top), with push, pop and peek at the top; a queue holding 1 (front), 2, 3 (back), with remove and peek at the front and add at the back]

Stacks

stack: A collection based on the principle of adding elements and retrieving them in the opposite order.

Basic stack operations:
◦ push: Add an element to the top.
◦ pop: Remove the top element.
◦ peek: Examine the top element.

Stacks in computer science

• Programming languages and compilers:
  ◦ method call stack
• Matching up related pairs of things:
  ◦ check correctness of brackets (){}[]
• Sophisticated algorithms:
  ◦ undo stack
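The bracket-checking use case mentioned above makes a compact stack example. This sketch uses `java.util.ArrayDeque` as the stack; the class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class BracketCheckSketch {
    /** Returns true if every (, { and [ is closed by the matching bracket in the right order. */
    static boolean isBalanced(String text) {
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : text.toCharArray()) {
            switch (c) {
                case '(':
                case '{':
                case '[':
                    stack.push(c); // remember the opening bracket
                    break;
                case ')':
                    if (stack.isEmpty() || stack.pop() != '(') return false;
                    break;
                case '}':
                    if (stack.isEmpty() || stack.pop() != '{') return false;
                    break;
                case ']':
                    if (stack.isEmpty() || stack.pop() != '[') return false;
                    break;
                default:
                    break; // ignore all other characters
            }
        }
        return stack.isEmpty(); // leftover openers mean unbalanced input
    }

    public static void main(String[] args) {
        System.out.println(isBalanced("a[i] = f(x) { }")); // true
        System.out.println(isBalanced("(]"));              // false
    }
}
```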

Queues

queue: Retrieves elements in the order they were added.

Basic queue operations:
◦ add (enqueue): Add an element to the back.
◦ remove (dequeue): Remove the front element.
◦ peek: Examine the front element.

Queues in computer science

• Operating systems:
  ◦ queue of print jobs to send to the printer
• Programming:
  ◦ modeling a line of customers or clients
• Real world examples:
  ◦ people on an escalator or waiting in a line
  ◦ cars at a gas station
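A small sketch of first-in, first-out behaviour, modeled on the print-job example, is shown here. It uses `java.util.ArrayDeque` through the `Queue` interface; the class name, method name and job names are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Queue;

final class PrintQueueSketch {
    /** Enqueues all jobs, then dequeues them; returns the order in which they were served. */
    static String serveAll(String... jobs) {
        Queue<String> queue = new ArrayDeque<>();
        for (String job : jobs) {
            queue.add(job);               // add (enqueue) at the back
        }
        StringBuilder served = new StringBuilder();
        while (!queue.isEmpty()) {
            served.append(queue.remove()); // remove (dequeue) from the front
            if (!queue.isEmpty()) {
                served.append(' ');
            }
        }
        return served.toString();
    }

    public static void main(String[] args) {
        System.out.println(serveAll("report.pdf", "photo.png", "letter.txt"));
        // report.pdf photo.png letter.txt
    }
}
```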

Map

• A data structure optimized for a very specific kind of search / access
• In a map we access by asking "give me the value associated with this key."
• capacity, load factor

Hash Function

[Figure: a hash function maps arbitrary keys, for example the character A, a name such as "Ivan Ivanov", an e-mail address, or a date such as 5/5/1967, to integer hash values, e.g. A -> 65]

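Hash Map

• Implements Map
• Fast put, get operations
• Relies on hashCode() and equals()

Example: inserting ("BG", "359") into a table with 6 buckets (indices 0 to 5):

    key = "BG"  ->  hashCode() = 2117  ->  2117 % 6 = 5  ->  ("BG", "359") is stored in bucket 5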
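The bucket computation in this example can be reproduced in a few lines. This is a simplified sketch of the idea from the slide: the helper `bucketIndex` is hypothetical, and the real `java.util.HashMap` computes its bucket index differently internally.

```java
import java.util.HashMap;
import java.util.Map;

final class BucketIndexSketch {
    /** Maps a key to a bucket in [0, bucketCount) using hashCode() modulo the table size. */
    static int bucketIndex(Object key, int bucketCount) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(key.hashCode(), bucketCount);
    }

    public static void main(String[] args) {
        System.out.println("BG".hashCode());       // 2117
        System.out.println(bucketIndex("BG", 6));  // 5

        // Using the library class: put and get by key.
        Map<String, String> codes = new HashMap<>();
        codes.put("BG", "359");
        System.out.println(codes.get("BG"));       // 359
    }
}
```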

Handling Collisions

What to do when inserting an element and something is already present in its slot?

Open Address Hashing

• Could search forward or backwards for an open space
• Linear probing
  ◦ move forward 1 spot. Open? If not, 2 spots, 3 spots, ...
• Quadratic probing
  ◦ 1 spot, 4 spots, 9 spots, 16 spots, ...
• Resize when the load factor reaches some limit

Chaining

• Each element of the hash table is another data structure
  ◦ LinkedList
  ◦ Balanced Binary Tree
• Resize at a given load factor or when any chain reaches some limit

Tree Map

• Implements Map
• Sorted
• Easy access to the biggest key
• Logarithmic put, get
• Keys must be Comparable, or a Comparator must be supplied

Binary trees

• 0, 1, or 2 children per node
• Binary Search Tree
  ◦ node.left < node.value
  ◦ node.right >= node.value
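A brief usage sketch of `java.util.TreeMap` shows the sorted-key behaviour. The country codes are sample data chosen for illustration, and the class and method names are hypothetical.

```java
import java.util.TreeMap;

final class TreeMapSketch {
    /** Builds a small map from country code to calling code; keys are kept sorted. */
    static TreeMap<String, String> sampleCodes() {
        TreeMap<String, String> codes = new TreeMap<>(); // String keys are Comparable
        codes.put("BG", "359");
        codes.put("DE", "49");
        codes.put("AT", "43");
        return codes;
    }

    public static void main(String[] args) {
        TreeMap<String, String> codes = sampleCodes();
        System.out.println(codes.keySet());   // [AT, BG, DE]
        System.out.println(codes.firstKey()); // AT
        System.out.println(codes.lastKey());  // DE  (the "biggest" key)
        System.out.println(codes.get("BG"));  // 359
    }
}
```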

Priority Queue

• A priority queue stores a collection of entries
• Main methods of the Priority Queue ADT
  ◦ insert(k, x) inserts an entry with key k and value x
  ◦ removeMin() removes and returns the entry with the smallest key
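In the Java API, `java.util.PriorityQueue` plays this role: `add` corresponds to insert and `poll` to removeMin. The sketch below stores bare integer keys rather than (key, value) entries to stay short; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

final class PriorityQueueSketch {
    /** Inserts all keys, then repeatedly removes the minimum; returns the removal order. */
    static List<Integer> drainInOrder(int... keys) {
        PriorityQueue<Integer> queue = new PriorityQueue<>(); // min-heap by natural order
        for (int key : keys) {
            queue.add(key);           // insert
        }
        List<Integer> removed = new ArrayList<>();
        while (!queue.isEmpty()) {
            removed.add(queue.poll()); // removeMin
        }
        return removed;
    }

    public static void main(String[] args) {
        System.out.println(drainInOrder(5, 1, 3)); // [1, 3, 5]
    }
}
```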

Heaps

A heap can be seen as a complete binary tree:

                  16
             14        10
           8    7     9   3
          2 4  1

In practice, heaps are usually implemented as arrays:

    A = 16 14 10 8 7 9 3 2 4 1

To represent a complete binary tree as an array:
◦ The root node is A[1]
◦ Node i is A[i]
◦ The parent of node i is A[i/2] (note: integer divide)
◦ The left child of node i is A[2i]
◦ The right child of node i is A[2i + 1]

Heapify() Example

Heapify() restores the heap property at a node by swapping it with its larger child and repeating down the tree. Here the node holding 4 violates the heap property:

    A = 16  4 10 14  7  9  3  2  8  1    4 is smaller than its children 14 and 7: swap with 14
    A = 16 14 10  4  7  9  3  2  8  1    4 is smaller than its child 8: swap with 8
    A = 16 14 10  8  7  9  3  2  4  1    4 is now a leaf: the heap property holds

Heap sort

[Figure: the max-heap 16 14 10 8 7 9 3 2 4 1, shown as a tree and as an array]
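Heapify and heap sort can be sketched in Java as follows. One deliberate difference from the slides: Java arrays are 0-based, so the children of node i are at 2i + 1 and 2i + 2 instead of 2i and 2i + 1. The class and method names are hypothetical.

```java
import java.util.Arrays;

final class HeapSortSketch {
    /** Sifts a[i] down within a[0..size) until the max-heap property holds below it. */
    static void heapify(int[] a, int size, int i) {
        while (true) {
            int left = 2 * i + 1;
            int right = 2 * i + 2;
            int largest = i;
            if (left < size && a[left] > a[largest]) largest = left;
            if (right < size && a[right] > a[largest]) largest = right;
            if (largest == i) {
                return; // both children are smaller (or absent)
            }
            swap(a, i, largest);
            i = largest; // continue from the child that received the smaller value
        }
    }

    /** Sorts the array in place in ascending order. */
    static void heapSort(int[] a) {
        // Build a max-heap bottom-up, starting from the last non-leaf node.
        for (int i = a.length / 2 - 1; i >= 0; i--) {
            heapify(a, a.length, i);
        }
        // Repeatedly move the maximum to the end and shrink the heap.
        for (int end = a.length - 1; end > 0; end--) {
            swap(a, 0, end);
            heapify(a, end, 0);
        }
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static void main(String[] args) {
        int[] a = {16, 4, 10, 14, 7, 9, 3, 2, 8, 1};
        heapify(a, a.length, 1); // fix the node holding 4
        System.out.println(Arrays.toString(a)); // [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
        heapSort(a);
        System.out.println(Arrays.toString(a)); // [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]
    }
}
```

Heap sort runs in O(n log n) time and sorts in place, but it is not stable.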

Collections Toolbox

• java.util.Collections provides basic operations on lists.
• java.util.Arrays exports similar basic operations for an array.

    Method                      Description
    binarySearch(list, key)     Finds key in a sorted list using binary search.
    sort(list)                  Sorts a list into ascending order.
    min(list)                   Returns the smallest value in a list.
    max(list)                   Returns the largest value in a list.
    reverse(list)               Reverses the order of elements in a list.
    shuffle(list)               Randomly rearranges the elements in a list.
    swap(list, p1, p2)          Exchanges the elements at index positions p1 and p2.
    replaceAll(list, x1, x2)    Replaces all elements matching x1 with x2.

Questions & Answers
