Dijkstra’s Algorithm
Keep Going!
Pre-Computing Shortest Paths
• How many paths to pre-compute?
• Recall:
  – Using single-source to single-dest find_path:
    • Need any delivery location to any other travel time:
      – N * (N-1) = 2450 calls for N = 50
    • Plus any depot to any delivery location
      – M * N = 500 calls for N = 50, M = 10
    • Plus any delivery location to any depot
      – N * M = 500 calls
  – Total: 3450 calls to your find_path
Pre-Computing Travel Time Paths
• Using single-source to all destinations
  – Need any delivery location to any other
    • N calls = 50
  – Plus any depot to any delivery location
    • M calls = 10
  – Plus any delivery location to any depot
    • 0 calls (get from earlier call)
  – Total: 60 calls
• Is this the minimum?
  – No, with small change can achieve: 51 calls
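The call counts on the last two slides can be checked with a few lines of arithmetic. This is only a sketch: the function names are made up for illustration, with N the number of delivery locations and M the number of depots as above.

```cpp
#include <cassert>

// Calls needed when every find_path call handles one source and one destination.
long pairwiseCalls(long N, long M) {
    assert(N >= 1 && M >= 0);
    long deliveryToDelivery = N * (N - 1);
    long depotToDelivery = M * N;
    long deliveryToDepot = N * M;
    return deliveryToDelivery + depotToDelivery + deliveryToDepot;
}

// Calls needed when every call finds paths from one source to all destinations.
long singleSourceCalls(long N, long M) {
    assert(N >= 1 && M >= 0);
    // Delivery-to-depot times fall out of the N delivery-sourced calls.
    return N + M;
}
```

With N = 50 and M = 10 these give 3450 and 60, matching the totals above.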
Is This Fast Enough?
• Recall:
  – Dijkstra’s algorithm can search whole graph
  – Especially with multiple destinations
  – O(N) items to put in wavefront (here N = graph nodes, not deliveries)
  – Using heap / priority_queue:
    • O(log N) to add / remove 1 item from wavefront
• Total:
  – O(N log N)
  – Can execute in well under a second
  – OK!
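A minimal sketch of a single-source, all-destinations search with a priority_queue wavefront follows. The Graph and Edge types and the name travelTimesFrom are assumptions for illustration, not the course API; a real version would also record the path, not just the travel time.

```cpp
#include <cassert>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical graph type: graph[u] lists the edges leaving node u.
struct Edge { int to; double time; };
using Graph = std::vector<std::vector<Edge>>;

// Shortest travel time from source to every node (infinity if unreachable).
std::vector<double> travelTimesFrom(const Graph& graph, int source) {
    assert(source >= 0 && source < static_cast<int>(graph.size()));
    const double INF = std::numeric_limits<double>::infinity();
    std::vector<double> best(graph.size(), INF);

    // Wavefront as a min-heap: smallest travel time comes out first.
    using Item = std::pair<double, int>;  // (travel time so far, node)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> wavefront;

    best[source] = 0.0;
    wavefront.push(Item(0.0, source));
    while (!wavefront.empty()) {
        Item top = wavefront.top();
        wavefront.pop();
        double time = top.first;
        int node = top.second;
        if (time > best[node]) continue;  // stale entry: a better time was already found
        for (const Edge& e : graph[node]) {
            double candidate = time + e.time;
            if (candidate < best[e.to]) {
                best[e.to] = candidate;
                wavefront.push(Item(candidate, e.to));
            }
        }
    }
    return best;
}
```

One call per delivery location or depot then fills a whole row of the travel-time table.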
Escaping Local Minima
Revisited
Say We’re In This State
[Figure: route through delivery locations 0 to 9]
Local perturbation to improve?
deliveryOrder = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
Swap Order of Two Deliveries?
[Figure: route through delivery locations 0 to 9, with deliveries 2 and 3 swapped]
Before: deliveryOrder = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
After swap: deliveryOrder = {0, 1, 3, 2, 4, 5, 6, 7, 8, 9}
No swap of two deliveries can improve!
Stuck in a local minimum
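The swap perturbation can be sketched as below. It assumes a pre-computed travel-time table indexed by delivery number; the names Matrix, routeTime and improveBySwap are hypothetical, and depots are ignored for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// travelTime[a][b]: pre-computed travel time from delivery a to delivery b.
using Matrix = std::vector<std::vector<double>>;

// Total travel time of visiting the deliveries in the given order.
double routeTime(const std::vector<int>& order, const Matrix& travelTime) {
    double total = 0.0;
    for (std::size_t i = 0; i + 1 < order.size(); ++i)
        total += travelTime[order[i]][order[i + 1]];
    return total;
}

// Tries every swap of two deliveries and keeps the first one that lowers the
// travel time. Returns false if no swap helps: a local minimum for this
// perturbation.
bool improveBySwap(std::vector<int>& order, const Matrix& travelTime) {
    assert(!travelTime.empty());
    double current = routeTime(order, travelTime);
    for (std::size_t i = 0; i < order.size(); ++i) {
        for (std::size_t j = i + 1; j < order.size(); ++j) {
            std::swap(order[i], order[j]);
            if (routeTime(order, travelTime) < current) return true;
            std::swap(order[i], order[j]);  // undo: this swap was not better
        }
    }
    return false;
}
```

Calling improveBySwap in a loop until it returns false reaches the kind of local minimum shown on this slide.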
2-Opt?
[Figure: route through delivery locations 0 to 9, cut at two places]
Path cut into 3 pieces
deliveryOrder = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
2-Opt?
[Figure: route through delivery locations 0 to 9, middle piece reversed]
Reconnected: worse!
Before: deliveryOrder = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
After: deliveryOrder = {0, 1, 2, 6, 5, 4, 3, 7, 8, 9}
2-Opt?
[Figure: route through delivery locations 0 to 9, last piece moved ahead of the reversed middle piece]
Reconnected differently: now better!
Before: deliveryOrder = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
After: deliveryOrder = {0, 1, 2, 7, 8, 9, 6, 5, 4, 3}
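The two reconnections shown on these slides can be reproduced by cutting the order at positions 3 and 7. The helper names below are hypothetical; a full 2-opt pass would try many cut points and keep a reconnection only if its travel time is lower.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Cuts order into [0, i), [i, j), [j, end) and reverses the middle piece.
std::vector<int> reverseMiddle(const std::vector<int>& order,
                               std::size_t i, std::size_t j) {
    assert(i <= j && j <= order.size());
    std::vector<int> result = order;
    std::reverse(result.begin() + i, result.begin() + j);
    return result;
}

// Same cut, reconnected as: first piece, last piece, reversed middle piece.
std::vector<int> lastThenReversedMiddle(const std::vector<int>& order,
                                        std::size_t i, std::size_t j) {
    assert(i <= j && j <= order.size());
    std::vector<int> result(order.begin(), order.begin() + i);
    result.insert(result.end(), order.begin() + j, order.end());
    // Reverse iterators walk the middle piece from index j-1 down to i.
    result.insert(result.end(),
                  order.rbegin() + (order.size() - j),
                  order.rbegin() + (order.size() - i));
    return result;
}
```

With i = 3 and j = 7 on {0, ..., 9}, the first function gives the "worse" order and the second gives the "better" order from the slides.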
Perturbations & Local Minima
• Explore lots of local perturbations
  – Compute travel time for each
  – To see what’s better
• Escape local minima with more powerful perturbations
• And/or use hill climbing
• Powerful unifying technique
  – Simulated annealing
  – Lots of hill climbing early
  – Little later (metal has cooled)
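One common way to decide when to accept a worse solution in simulated annealing is the Metropolis rule, sketched below. The slides do not say which rule to use, so treat this as an assumption; acceptMove and its parameters are hypothetical names, and the caller supplies the random number so the function stays deterministic.

```cpp
#include <cassert>
#include <cmath>

// deltaTime:   new travel time minus current travel time (negative = better).
// temperature: positive; high early in the run, lowered as the "metal cools".
// uniform01:   random number in [0, 1) supplied by the caller.
bool acceptMove(double deltaTime, double temperature, double uniform01) {
    assert(temperature > 0.0);
    if (deltaTime <= 0.0) return true;  // never reject an improvement
    // Worse moves are accepted with probability exp(-deltaTime / temperature).
    return uniform01 < std::exp(-deltaTime / temperature);
}
```

At a high temperature almost every worse move is accepted (lots of hill climbing); as the temperature drops, the rule approaches "accept improvements only".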
How Do I Finish by the Time Limit?
#include <time.h>
#define TIME_LIMIT 30 // m4: 30 second time limit

int main() {
    clock_t startTime = clock(); // Clock "ticks"
    float timeSecs; // Declared outside the loop so the while condition can see it

    do {
        myOptimizer();
        clock_t currentTime = clock();
        timeSecs = ((float) (currentTime - startTime)) / CLOCKS_PER_SEC;
        // Keep optimizing until within 10% of time limit
    } while (timeSecs < 0.9 * TIME_LIMIT);
    ...
}
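clock() reports processor time, which on many systems is summed over all threads, so it can run ahead of wall-clock time once a program is multithreaded. A hedged alternative is to measure elapsed time with std::chrono::steady_clock. The Deadline class below is a hypothetical helper, not part of the course code.

```cpp
#include <cassert>
#include <chrono>

// Tracks wall-clock time against a fraction of a time limit.
class Deadline {
public:
    explicit Deadline(double limitSecs, double fraction = 0.9)
        : start_(std::chrono::steady_clock::now()),
          budgetSecs_(limitSecs * fraction) {
        assert(limitSecs > 0.0 && fraction > 0.0 && fraction <= 1.0);
    }

    // Seconds of wall-clock time since this object was created.
    double elapsedSecs() const {
        std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start_;
        return elapsed.count();
    }

    // True while less than fraction * limit has been used.
    bool timeLeft() const { return elapsedSecs() < budgetSecs_; }

private:
    std::chrono::steady_clock::time_point start_;
    double budgetSecs_;
};
```

A loop such as `Deadline d(30.0); do { myOptimizer(); } while (d.timeLeft());` would mirror the structure of the slide's code.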
Algorithm Challenge
Algorithms: Challenge Question
• Frank likes what he calls “cool” numbers
• For cool numbers, there are integers x and y such that
  – Cool number = x^2 = y^3
  – For example, 1 is cool (= 1^2 = 1^3) and 64 is cool (= 8^2 = 4^3)
  – 25 is not cool (= 5^2, but no integer cubed = 25)
1. Write a program to print all cool numbers between 1 and N
2. Calculate the computational complexity of your program
3. Mail me program & complexity: first 5 with the lowest complexity get a chocolate bar in class Fri.
• Source: ACM Programming Competition
Multithreading
Why & How
Intel 8086
• First PC microprocessor
• 1978
• 29,000 transistors
• 5 MHz
• ~10 clocks / instruction
• ~500,000 instructions / s
Intel Core i7
• 2011
• 1 billion transistors
• 3.5 GHz
• ~15 clocks / instruction, but ~30 instructions in flight at once
  – Average about 2 instructions completed / clock
• Can execute ~7 billion instructions / s
1978 to 2011
• 35,000x more transistors
• ~14,000x more instructions / s
• The future:
  – Still getting 2X the transistors every 2 years
  – But transistors not getting much faster
    • Clock speed saturating
  – ~30 instructions in flight
    • Complexity & power to go beyond this climbs rapidly
    • Slow growth in instructions / cycle
  – Impact: CPU speed not increasing as rapidly
• Using multiple processors (cores) now important
• Multithreading: one program using multiple cores at once
A Single-Threaded Program
[Figure: single-threaded program]
Memory:
• Instructions (code)
• Global variables
• Heap variables (new)
• Stack (local variables)
CPU / Core:
• Program counter
• Stack pointer
A Multi-Threaded Program
[Figure: multi-threaded program]
Memory, shared by all threads:
• Instructions (code)
• Global variables
• Heap variables (new)
Each thread gets own local variables:
• Stack1 (local variables) for thread 1
• Stack2 (local variables) for thread 2
Core1 (thread 1) and Core2 (thread 2) each have:
• Program counter
• Stack pointer
Thread Basics
• Each thread has own program counter
  – Can be executing a different function
  – Is (almost always) executing a different instruction from other threads
• Each thread has own stack
  – Has its own copy of local variables (all different)
• Each thread sees same global variables
• Dynamically allocated memory
  – Shared by all threads
  – Any thread with a pointer to it can access
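A small sketch of these points using std::thread: each thread keeps a private localSum on its own stack, while both read the same shared vector and write to different slots of a shared result. The function names are hypothetical, and building it may need a threading flag such as -pthread on some toolchains.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Sums data[begin, end) into out. localSum lives on the calling thread's stack.
void sumRange(const std::vector<int>& data, std::size_t begin,
              std::size_t end, long& out) {
    assert(begin <= end && end <= data.size());
    long localSum = 0;  // private to this thread
    for (std::size_t i = begin; i < end; ++i) localSum += data[i];
    out = localSum;     // each thread writes a different slot, so no conflict
}

long sumWithTwoThreads(const std::vector<int>& data) {
    std::size_t mid = data.size() / 2;
    std::vector<long> partial(2, 0);  // heap memory shared by both threads
    std::thread t1([&] { sumRange(data, 0, mid, partial[0]); });
    std::thread t2([&] { sumRange(data, mid, data.size(), partial[1]); });
    t1.join();  // wait for both threads before reading their results
    t2.join();
    return partial[0] + partial[1];
}
```

Because the two threads never write the same location, no locking is needed here.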
Implications
• Threads can communicate through memory
  – Global variables
  – Dynamically allocated memory
  – Fast communication!
• Must be careful threads don’t conflict in reads/writes to same memory
  – What if two threads update the same global variable at the same time?
  – Not clear which update wins!
• Can have more threads than CPUs
  – Time share the CPUs
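One way to keep updates to a shared global from conflicting is to guard it with a std::mutex, sketched below with hypothetical names. Without the lock, increments from different threads could overwrite each other and the final count would be unpredictable.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

long sharedCounter = 0;   // global variable: visible to every thread
std::mutex counterMutex;  // protects sharedCounter

// Adds 1 to the shared counter the given number of times.
void addMany(int times) {
    assert(times >= 0);
    for (int i = 0; i < times; ++i) {
        std::lock_guard<std::mutex> lock(counterMutex);  // one thread at a time
        ++sharedCounter;
    }
}

// Runs numThreads threads that each add timesEach to the counter.
long countWithThreads(int numThreads, int timesEach) {
    sharedCounter = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < numThreads; ++t)
        threads.emplace_back(addMany, timesEach);
    for (std::thread& th : threads) th.join();
    return sharedCounter;
}
```

The number of threads can exceed the number of cores; the operating system then time shares the cores among them.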