getting to the bottom, fast

Getting to the bottom, fast Prof. Ramin Zabih http://cs100r.cs.cornell.edu

Upload: tivona

Post on 18-Mar-2016


Page 1: Getting to the bottom, fast

Getting to the bottom, fast

Prof. Ramin Zabih

http://cs100r.cs.cornell.edu

Page 2: Getting to the bottom, fast

Administrivia

Assignment 4 is due on Friday
Quiz 6 will be Thursday Oct 25
Prelim schedule
  – In class exams, 30 minutes
  – P2: Thursday Nov 1
  – P3: Thursday Nov 29 (last lecture…)
Please start thinking about your final project!
  – Due sometime during finals week (date TBA)

Page 3: Getting to the bottom, fast

2nd-to-last Convex Hull Fact

Convex hull is closely linked to sorting
You can sort the numbers $x_i$ by computing the convex hull of the points $(x_i, x_i^2)$
  – This implies that a fast method for convex hull gives you a fast method for sorting
  – Fact: you can’t do comparison-based sort in linear time
    • Proof is sometimes taught in CS280
  – Consequence: you can’t do convex hull in linear time either!
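The reduction from sorting to convex hull can be sketched as follows. This is a minimal illustration, assuming distinct integer inputs; the names `gift_wrap_hull` and `sort_via_hull` are made up here. Gift wrapping is used only because it never sorts its input. It takes quadratic time, so this shows the link between the two problems and is not a fast sort.

```python
def cross(o, a, b):
    """Positive if o -> a -> b turns counterclockwise, negative if clockwise."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def gift_wrap_hull(points):
    """Convex hull by gift wrapping, counterclockwise from the leftmost point."""
    start = min(points)  # linear scan for the leftmost point, no sorting
    hull = []
    current = start
    while True:
        hull.append(current)
        candidate = None
        for p in points:
            if p == current:
                continue
            # Switch to p whenever p lies to the right of current -> candidate.
            if candidate is None or cross(current, candidate, p) < 0:
                candidate = p
        if candidate is None or candidate == start:
            return hull
        current = candidate


def sort_via_hull(xs):
    """Sort distinct numbers by lifting them onto the parabola y = x^2."""
    if not xs:
        return []
    points = [(x, x * x) for x in xs]
    # Every lifted point is a hull vertex, and walking the hull
    # counterclockwise from the leftmost point visits them by increasing x.
    return [x for x, _ in gift_wrap_hull(points)]


print(sort_via_hull([3, -1, 2, 10, 0]))
```

Because the parabola is strictly convex, no lifted point falls inside the hull, which is why the hull order recovers the sorted order.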

Page 4: Getting to the bottom, fast

Why is an error function hard?

An error function where we can get stuck if we roll downhill is a hard one
  – Where we get stuck depends on where we start (i.e., initial guess/conditions)
  – An error function is hard if the area “above it” has a certain shape
    • Nooks and crannies
    • In other words, non-CONVEX!
  – Non-convex error functions are hard to minimize

Page 5: Getting to the bottom, fast

More general problem

Suppose we have a convex function f(x)
  – 1D, for simplicity
How do we find the minimum x*?
This is an extremely important task
  – For example, for various physics problems
We showed you a dumb way to do this
  – Hillclimbing: start somewhere, change your guess a little, see if it improves
Can we do something smarter?
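For comparison with what follows, the hillclimbing idea might look like the sketch below. The function name, the fixed step size and the step cap are all assumptions made for illustration, not something the course prescribes.

```python
def hill_climb(f, x, step=0.01, max_steps=100_000):
    """Nudge x left or right by a fixed step while that lowers f(x)."""
    for _ in range(max_steps):
        if f(x + step) < f(x):
            x += step
        elif f(x - step) < f(x):
            x -= step
        else:
            break  # neither neighbor is better, so stop here
    return x


# Example: a convex function with its minimum at x = 2.
print(hill_climb(lambda x: (x - 2) ** 2, 0.0))
```

The number of function evaluations grows with the distance to the minimum divided by the step size, which is the cost a smarter method should avoid.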

Page 6: Getting to the bottom, fast

Interval strategy

Many algorithms “bracket” the minimum
  – Maintain an interval [xlow, xhigh] such that the minimum x* lies somewhere inside
    • We will call such an interval valid
  – Shrink the interval steadily
To start off, we need an initial interval
  – For simplicity, we will pretend the minimum happens at some positive value (i.e., x* > 0)
  – So our initial interval is [0, ???]
Need an initial xhigh > x*, i.e. to the right of the minimum. How can we find one?

Page 7: Getting to the bottom, fast

Finding derivatives

Suppose we can tell something about the slope of f
  – In other words, the derivative f’
Suppose we just know the sign of f’
  – I.e., we know f is increasing/decreasing
  – We know:
  – Derivative sign tells us if we are to the left or right of the answer
Approach (standard CS “hallucination”)
  – Pretend we know the sign of f’
  – Figure out how to compute the sign of f’

Page 8: Getting to the bottom, fast

Computing the initial interval

We can find an initial choice of xhigh by moving right until the derivative has the correct sign
  – I.e., f is increasing: f’ ≥ 0
  – As we will see, the best way to do this is by repeated doubling
    • Try 1, then 2, then 4, then 8, etc.
    • This will get us a valid xhigh fast, though it might be very large
      For instance, if x* = 1025, xhigh = 2048
    • This will turn out not to matter!
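Repeated doubling can be sketched in a few lines. Here `dsign` is a hypothetical caller-supplied function that returns a number with the same sign as f’, and the sketch assumes the minimum really is at a finite positive x (otherwise the loop never ends).

```python
def find_xhigh(dsign):
    """Double xhigh until f is no longer decreasing there (f' >= 0)."""
    xhigh = 1.0
    while dsign(xhigh) < 0:
        xhigh *= 2
    return xhigh


# Example from the slide: a minimum at x* = 1025, so f'(x) has the sign of x - 1025.
print(find_xhigh(lambda x: x - 1025))  # 2048.0
```

Reaching 2048 takes only 11 doublings, even though the answer overshoots x* by almost a factor of two.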

Page 9: Getting to the bottom, fast

Reducing the interval

We have a valid interval [xlow, xhigh]
  – How do we create a smaller valid interval?
  – Consider the midpoint: xmid = (xlow + xhigh) / 2
  – The two intervals [xlow, xmid] and [xmid, xhigh] are half the size
  – One of them is valid!
    • Can you prove this?
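One way to choose the valid half, sketched under the assumption that f is convex and that a hypothetical function `dsign` gives the sign of f’. For a convex function the slope never decreases as x grows, so a non-negative slope at the midpoint means the minimum is at or to the left of it, and a negative slope means it is to the right. The names `shrink`, `minimize` and `tol` are illustrative.

```python
def shrink(xlow, xhigh, dsign):
    """Return the half of [xlow, xhigh] that still contains the minimum."""
    xmid = (xlow + xhigh) / 2
    if dsign(xmid) >= 0:
        return xlow, xmid   # f is not decreasing at xmid: look left
    return xmid, xhigh      # f is still decreasing at xmid: look right


def minimize(dsign, xlow, xhigh, tol=1e-6):
    """Halve a valid interval until it is narrower than tol."""
    while xhigh - xlow > tol:
        xlow, xhigh = shrink(xlow, xhigh, dsign)
    return (xlow + xhigh) / 2


# Minimum at x* = 1025, starting from the interval found by doubling.
print(minimize(lambda x: x - 1025, 0.0, 2048.0))
```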

Page 10: Getting to the bottom, fast

Binary search in action

[Figure: sum of squared errors plotted against m, for m between roughly 1.8 and 2.2]

Page 11: Getting to the bottom, fast

Binary search speed

The basic operation is very fast
  – Each iteration halves the size of the interval!
    • This is why it’s OK to start with a big interval
We need to evaluate the function at a relatively small number of places
  – Reducing the number of function evaluations is the name of the game
  – We also need to evaluate the derivative
    • Which is a bit more work, though not hard
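A quick way to see why a large starting interval is harmless is to count the halvings needed to reach a given width. The function name and the tolerance of 1e-6 are chosen here purely for illustration.

```python
def halvings_needed(width, tol):
    """Count how many times width must be halved to drop to tol or below."""
    count = 0
    while width > tol:
        width /= 2
        count += 1
    return count


print(halvings_needed(2.0, 1e-6))     # 21
print(halvings_needed(2048.0, 1e-6))  # 31
```

Making the starting interval 1024 times wider costs only 10 extra iterations, since 1024 is 2 to the power 10.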

Page 12: Getting to the bottom, fast

Evaluating the derivative

To compute the derivative f’ we recall:
  $f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$
We can approximate this by using a suitably small value of h
  – Details in CS322/CS421
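Since only the sign of f’ is needed, a rough approximation is enough. The sketch below uses a forward difference with h = 1e-6; that value and the function name are assumptions for illustration, and choosing h well is the subject of the later courses mentioned above.

```python
def derivative_sign(f, x, h=1e-6):
    """Return -1, 0 or 1: the sign of the forward-difference slope at x."""
    slope = (f(x + h) - f(x)) / h
    return (slope > 0) - (slope < 0)


f = lambda x: (x - 3) ** 2      # convex, with its minimum at x = 3
print(derivative_sign(f, 1.0))  # -1: left of the minimum, f is decreasing
print(derivative_sign(f, 5.0))  # 1: right of the minimum, f is increasing
```

Each sign query costs two evaluations of f, which is the "bit more work" per iteration.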

Page 13: Getting to the bottom, fast

More binary search

This technique is quite general, and useful for lots of problems besides minimization
  – We’ll need it for images when we look into recognizing colored objects
Here is a nice (non-image) example
  – Suppose you have many versions of a file, that lots of people are editing
    • Think of a big business document (or program)
  – You’d like to know who introduced some text
    • Often, you’re looking for a bug…

Page 14: Getting to the bottom, fast

Binary search example

We want the earliest document where a certain change was made
If we are going to make many queries like this, it is worth spending some effort at the beginning to make the later ones fast
This is an incredibly common technique
  – Pay up front, reap the rewards later
We will simply sort the documents by date
  – Sorting, in CS, is often the answer

Page 15: Getting to the bottom, fast

Using binary search

With the documents in order, we can find the earliest one with the text we want via binary search
  – The last document has the new text, the first document does not: interval = [first, last]
  – What about the middle document?
  – If it has the new text, we can look in the interval [first, middle]
  – If it doesn’t have the new text, we can look in the interval [middle+1, last]
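The steps above can be sketched as follows. The sketch assumes the versions are stored as strings in a list already sorted by date, that the last version contains the text, and that once the text appears it stays in every later version. The function name and the sample data are made up.

```python
def earliest_with_text(docs, text):
    """Index of the earliest version containing text (docs sorted by date)."""
    first, last = 0, len(docs) - 1
    while first < last:
        middle = (first + last) // 2
        if text in docs[middle]:
            last = middle        # the answer is middle or earlier
        else:
            first = middle + 1   # the answer is after middle
    return first


versions = ["a", "a b", "a b bug", "a b bug c", "a b bug c d"]
print(earliest_with_text(versions, "bug"))  # 2
```

With n versions this inspects about log2(n) of them instead of scanning all n.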