11
Chapter 2: Process/Thread
Instructor: Hengming Zou, Ph.D.
In Pursuit of Absolute Simplicity 求于至简,归于永恒
22
Content
Processes
Threads
Inter-process communication
Classical IPC problems
Scheduling
33
Definition of A Process
Informal– A program in execution
– A running piece of code along with all the things the program can read/write
Formal– One or more threads in their own address space
Note that process != program
44
The Need for Process
What is the principal motivation for inventing processes?– To support multiprogramming
55
The Process Model
Conceptual viewing of the processes
Concurrency– Multiple processes seem to run concurrently
– But in reality only one active at any instant
Progress– Every process makes progress
66
Multiprogramming of 4 programs: programs in memory; conceptual view; timeline view
77
Process Creation
Principal events that cause process creation
1. System initialization
2. Execution of a process creation system call
3. User request to create a new process
88
Process Termination
Conditions which terminate processes
1. Normal exit (voluntary)
2. Error exit (voluntary)
3. Fatal error (involuntary)
4. Killed by another process (involuntary)
99
Process Hierarchies
Parent creates a child process
A child process can in turn create its own child processes
Process creation forms a hierarchy– UNIX calls this a "process group"
– Windows has no concept of such hierarchy, i.e. all processes are created equal
1010
Process States
Possible process states– running, blocked, ready
Transitions between states:
– running → blocked: process blocks for input
– running → ready: scheduler picks another process
– blocked → ready: input becomes available
– ready → running: scheduler picks this process
1111
Process Space
Also called Address Space
All the data the process uses as it runs
Passive (acted upon by the process)
Play analogy: – all the objects on the stage in a play
1212
Process Space
Is the unit of state partitioning– Each process occupies a different portion of the state of the computer
Main topic: – How multiple process spaces can share a single physical memory efficiently and safely
1313
Manage Process & Space
Who manages process & space?– The operating systems
How does OS achieve it?– By maintaining information about processes
– i.e. use process tables
1414
Fields of A Process Table
Registers, Program counter, Status word
Stack pointer, Priority, Process ID
Parent process, Process group, Signals
Time when process started
CPU time used, Children’s CPU time, etc.
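As an illustration only (the field names below are hypothetical, not the layout of any particular OS), a process-table entry can be pictured as a C struct:

    /* Hypothetical sketch of one process-table entry (PCB) */
    #include <sys/types.h>

    struct pcb {
        /* CPU state saved and restored on a context switch */
        unsigned long registers[16];
        unsigned long program_counter;
        unsigned long status_word;
        unsigned long stack_pointer;

        /* identity and scheduling */
        pid_t pid;
        pid_t parent_pid;
        pid_t process_group;
        int   priority;
        int   state;                   /* running, ready, or blocked */

        /* signals and accounting */
        unsigned long pending_signals;
        long start_time;               /* time when process started */
        long cpu_time_used;
        long children_cpu_time;
    };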
1515
Problems with Process
While supporting multiprogramming on shared hardware
A process is itself single-threaded!– i.e. a process can do only one thing at a time
– a blocking call renders the entire process unrunnable
1616
Threads
Invented to support multiprogramming at the process level
Manage OS complexity– Multiple users, programs, I/O devices, etc.
Each thread dedicated to do one task
1717
Thread
Sequence of executing instructions from a program – i.e. the running computation
Play analogy: one actor on stage in a play
1818
Threads
Processes decompose a mix of activities into several parallel tasks
Each job can work independently of the others
– e.g. job 1, job 2, and job 3, each handled by its own thread (thread 1, thread 2, thread 3)
1919
The Thread Model
(a) 3 processes, each with 1 thread; (b) 1 process with 3 threads
– threads run in user space; the kernel runs in kernel space
2020
The Thread Model
Some items shared by all threads in a process
Some items private to each thread
2121
Shared and Private Items
Per-process items: address space, global variables, open files, child processes, pending alarms, signals and signal handlers, accounting information
Per-thread items: program counter, registers, stack, state
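A minimal sketch of this sharing, using POSIX threads (the names shared_counter, private_counter, and worker are illustrative, not from the slides): globals are visible to every thread, while variables on a thread's own stack are private to it.

    #include <pthread.h>
    #include <stdio.h>

    int shared_counter = 0;            /* global: shared by all threads         */

    void *worker(void *arg) {
        int private_counter = 0;       /* lives on this thread's own stack      */
        for (int i = 0; i < 1000; i++) {
            private_counter++;         /* always ends at 1000                   */
            shared_counter++;          /* racy: threads interleave on this line */
        }
        printf("private=%d shared=%d\n", private_counter, shared_counter);
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return 0;
    }

Compiled with -pthread, private_counter always reaches 1000 in each thread, while shared_counter may end below 2000 because the unsynchronized increments race.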
2222
Shared and Private Items
Each thread has its own stack
2323
A Word Processor w/3 Threads– an input thread, a display thread, and a backup thread
2424
Implementation of Thread
How many options to implement thread?
Implement in kernel space
Implement in user space
Hybrid implementation
2525
Kernel-Level Implementation
Completely implemented in kernel space
OS acts as a scheduler
OS maintains information about threads– In addition to processes
2626
Kernel-Level Implementation
2727
Kernel-Level Implementation
Advantage:– Easier to program
– Blocking one thread does not block the process
Problems:– Costly: need to trap into OS to switch threads
– OS space is limited (for maintaining thread info)
– Need to modify the OS!
2828
User-Level Implementation
Completely implemented in user space
A run-time system acts as a scheduler
Threads voluntarily cooperate– i.e. yield control to other threads
OS need not know existence of threads
2929
User-Level Implementation
3030
User-Level Implementation
Advantage:– Flexible, can be implemented on any OS
– Faster: no need to trap into OS
Problems:– Programming is tricky
– a blocking thread blocks the whole process!
3131
User-Level Implementation
How do we solve the problem that a blocking thread blocks the whole process?
Modify system calls to be non-blocking
Write a wrapper around blocking calls– i.e. issue the call only when it is safe to do so
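A sketch of such a wrapper (assuming a user-level run-time system with a hypothetical yield_to_other_thread() call; select() with a zero timeout is used to test whether read() would block):

    #include <sys/select.h>
    #include <unistd.h>

    extern void yield_to_other_thread(void);    /* hypothetical run-time system call */

    /* Only issue read() when it will not block; otherwise let another thread run. */
    ssize_t safe_read(int fd, void *buf, size_t n) {
        for (;;) {
            fd_set readable;
            struct timeval no_wait = {0, 0};    /* poll: do not wait at all */
            FD_ZERO(&readable);
            FD_SET(fd, &readable);
            if (select(fd + 1, &readable, NULL, NULL, &no_wait) > 0)
                return read(fd, buf, n);        /* data is ready, so read() is safe */
            yield_to_other_thread();            /* would block: run another user thread */
        }
    }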
3232
Scheduler Activations
A technique that solves the problem of blocking calls in user-level threads
Method: use upcalls
Goal – mimic functionality of kernel threads– gain performance of user space threads
3333
Scheduler Activations
Kernel assigns virtual processors to process
The runtime system allocates threads to processors
Blocking threads are handled by an OS upcall– i.e. the OS notifies the runtime system about blocking calls
3434
Scheduler Activations
Problem:
Reliance on kernel (lower layer) calling procedures in user space (higher layer)
Violates layered structure of OS design
3535
Hybrid Implementation
Can we have the best of both worlds– i.e. kernel-level and user-level implementations
While avoiding the problems of either?
Hybrid implementation
3636
Hybrid Implementation
User-level threads are managed by runtime systems
Kernel-level threads are managed by OS
Multiplexing user-level threads onto kernel-level threads
3737
Hybrid Implementation
3838
Multiple Threads
Can have several threads in a single address space– That is what threads were invented for
Play analogy: several actors on a single set– Sometimes interact (e.g. dance together)
– Sometimes do independent tasks
3939
Multiple Threads
Private state for a thread vs. global state shared between threads– What private state must a thread have?
– Other state is shared between all threads in a process
4040
Multiple Threads
Per-process items: address space, global variables, open files, child processes, pending alarms, signals and signal handlers, accounting information
Per-thread items: program counter, registers, stack, state
4141
Multiple Threads
Many programs are written as single-threaded processes
Making them multithreaded is very tricky– Can cause unexpected problems
4242
Multiple Threads
Conflicts between threads over the use of a global variable
4343
Multiple Threads
Many solutions:
Prohibit global variables
Assign each thread its private global variables
4444
Multiple Threads
Threads can have private global variables
4545
Cooperating Threads
Often we create threads to cooperate
Each thread handles one request
Each thread can issue a blocking disk I/O, – wait for I/O to finish
– then continue with next part of its request
4848
Cooperating Threads
Ordering of events from different threads is non-deterministic
– e.g. after 10 seconds, different threads may have gotten differing amounts of work done
thread A --------------------------------->
thread B - - - - >
thread C - - - - - - - - - - - >
4949
Cooperating Threads
Example– thread A: x=1
– thread B: x=2
Possible results?
Is 3 a possible output?– yes
5050
Cooperating Threads
3 is possible because assignment is not atomic and threads can interleave
if assignment to x is atomic– then only possible results are 1 and 2
5151
Atomic Operations
Atomic: indivisible– Either happens in its entirety without interruption or has yet to happen at all
No events from other threads can happen between the start & end of an atomic event
5252
Atomic Operations
On most machines, memory load & store are atomic
But many instructions are not atomic– e.g. a double-precision floating-point store on a 32-bit machine
– is two separate memory operations
5353
Atomic Operations
If you don’t have any atomic operations– you can’t make one
Fortunately, the hardware folks give us atomic operations– and we can build up higher-level atomic primitives from there
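As one concrete illustration (not from the slides), C11's <stdatomic.h> is one way a program can reach such hardware atomic operations; the names hits and record_hit are made up:

    #include <stdatomic.h>

    atomic_int hits = 0;               /* shared by many threads */

    void record_hit(void) {
        /* an atomic read-modify-write: no other thread's access can interleave */
        atomic_fetch_add(&hits, 1);
    }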
5454
Atomic Operations
Another example (i is shared):
– thread A: i = 0; while (i < 10) { i++ }; print "A finished"
– thread B: i = 0; while (i > -10) { i-- }; print "B finished"
Who will win?
5555
Atomic Operations
Is it guaranteed that someone will win?
What if threads run at exactly the same speed and start close together?
Is it guaranteed that it goes on forever?
5656
Atomic Operations
Arithmetic example– (initially y=10)
– thread A: x = y + 1;
– thread B: y = y * 2;
Possible results?
5757
Thread Synchronization
Must control the inter-leavings between threads– Or the results can be non-deterministic
All possible inter-leavings must yield a correct answer
5858
Thread Synchronization
Try to constrain the thread executions as little as possible
Controlling the execution and order of threads is called “synchronization”
5959
Gold Fish Problem
Problem definition:
Tracy and Peter want to keep a goldfish alive– By feeding the fish properly
If either sees that the fish is not fed– she/he goes to feed the fish
The fish must be fed once and only once each day
6060
Correctness Properties
Someone will feed the fish if needed
But never more than one person feeds the fish
6161
Solution #0
No synchronization– Peter and Tracy both run the same code:
if (noFeed) {
    feed fish
}
6262
Problem with Solution #0
3:00 Peter looks at the fish (not fed)
3:05 Tracy looks at the fish (not fed)
3:10 Peter feeds the fish
3:25 Tracy feeds the fish
The fish is overfed!
6363
Problem with Solution #0
Overfeeding occurred because:– They execute the same code at the same time
– i.e. if (noFeed) feed fish
This is called a race condition– Two or more threads try to access the same shared resource at the same time
6464
Critical Section
The shared resource or code is called a critical section– the region where race condition occurs
Access to it must be coordinated– i.e. only one process can access it at a time
6565
Critical Section
In solution #0, critical section is – “if (noFeed) feed fish”
Peter and Tracy must NOT be inside the critical section at the same time
6666
1st Type of Synchronization
Ensure only one person is in the critical section– i.e. only 1 person goes shopping at a time
This is called mutual exclusion:– only 1 thread is doing a certain thing at one time
– others are excluded
7373
Gold Fish Solution #1
Idea: – leave note that you’re going to check on the fish status, so other person doesn’t also feed
7474
Solution #1
Peter and Tracy both run:
if (noNote) {
    leave note
    if (noFeed) {
        feed fish
    }
    remove note
}
7575
Solution #1
Does this work?
If not, when could it fail?
Is solution #1 better than solution #0?
7676
Solution #2
Idea: change the order of “leave note” and “check note”
This requires labeled notes– otherwise you’ll see your own note
– and think it was the other person’s note
7777
Solution #2
Peter:
    leave notePeter
    if (no noteTracy) {
        if (noFeed) {
            feed fish
        }
    }
    remove notePeter

Tracy:
    leave noteTracy
    if (no notePeter) {
        if (noFeed) {
            feed fish
        }
    }
    remove noteTracy
7878
Solution #2
Does this work? – Yes, it solves the overfeeding problem
When could it fail?– It introduces a starvation problem: both can leave notes at the same time, each sees the other's note, and nobody feeds the fish
7979
Solution #3
Idea: – have a way to decide who will feed fish when both leave notes at the same time.
Approach:– Have Peter hang around to make sure job is done
8080
Solution #3
Peter:
    leave notePeter
    while (noteTracy) {
        do nothing
    }
    if (noFeed) {
        feed fish
    }
    remove notePeter

Tracy:
    leave noteTracy
    if (no notePeter) {
        if (noFeed) {
            feed fish
        }
    }
    remove noteTracy
8181
Solution #3
Peter's "while (noteTracy)" prevents him from running his critical section at the same time as Tracy
It also prevents Peter from running off without making sure that someone feeds the fish
8585
Solution #3
Correct, but ugly– Complicated (non-intuitive) to prove correct
Asymmetric – Peter and Tracy run different code
– This makes coding difficult
8686
Solution #3
Wasteful– Peter consumes CPU time while waiting for Tracy to remove note
Constantly checking some status while waiting on something is called busy-waiting– Very very bad
8787
Higher-Level Synchronization
What is the solution?– Raise the level of abstraction
Make life easier for programmers
8888
Higher-Level synchronization
Layers (bottom to top):
– Low-level atomic operations provided by hardware (i.e. load/store, interrupt enable/disable, test & set)
– High-level synchronization operations provided by software (i.e. semaphore, lock, monitor)
– Concurrent programs
8989
Lock (Mutex)
A Lock prevents another thread from entering a critical section– e.g. lock the fridge while you're shopping so that Peter and Tracy don't both go shopping
9090
Lock (Mutex)
Two operations
lock(): wait until the lock is free, then acquire it
    do {
        if (lock is free) {
            acquire lock; break
        }
    } while (1)
unlock(): release the lock
9191
Lock (Mutex)
Why was the "note" in Gold Fish solutions #1 and #2 not a good lock?
Does it meet the 4 conditions?
9292
Four Elements of Locking
Lock is initialized to be free
Acquire lock before entering critical section
Release lock when exiting critical section
Wait to acquire lock if another thread already holds it
9393
Solve “Too much fish” with lock
Peter and Tracy both run:
lock()
if (noFeed) {
    feed fish
}
unlock()
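A sketch of the same solution with POSIX threads (fish_lock and feed_if_needed are illustrative names; noFeed stands in for however the feeding status is actually stored):

    #include <pthread.h>

    pthread_mutex_t fish_lock = PTHREAD_MUTEX_INITIALIZER;
    int noFeed = 1;                             /* 1 means the fish has not been fed */

    void feed_if_needed(void) {
        pthread_mutex_lock(&fish_lock);         /* lock(): wait until the lock is free */
        if (noFeed) {
            /* feed fish */
            noFeed = 0;
        }
        pthread_mutex_unlock(&fish_lock);       /* unlock(): release the lock */
    }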
9494
Solve “Too much fish” with lock
But this prevents Tracy from doing other things while Peter is shopping– i.e. the critical section includes the shopping time
How to minimize the time the lock is held?
9595
Producer-Consumer Problem
Problem: – producer puts things into a shared buffer
– Consumer takes them out
– Need synchronization for coordinating producer and consumer
9696
Producer-Consumer Problem
Buffer between producer and consumer allows them to operate somewhat independently
9797
Producer-Consumer Problem
Otherwise must operate in lockstep– producer puts 1 thing in buffer, then consumer takes it out
– then producer adds another, then consumer takes it out, etc
9898
Producer-Consumer Problem
Coke machine example– delivery person (producer) fills machine with cokes
– students (consumers) buy cokes and drink them
– coke machine has finite space (buffer)
9999
Solution for PC Problem
What does producer do when buffer full?
What does consumer do when buffer empty?
The busy waiting solution in Solution 3 is unacceptable
100100
Solution for PC Problem
Use Sleep and Wakeup primitives
Consumer sleeps if buffer is empty– Wake up sleeping producers otherwise
Producer sleeps if buffer is full– Wake up sleeping consumers otherwise
101101
Sleep and Wakeup
#define N 100              /* # of slots in the buffer */
int count = 0;             /* # of items in the buffer */

void producer(void)
{
    int item;
    while (TRUE) {
        item = produce_item();
        if (count == N) sleep();
        insert_item(item);
        count = count + 1;
        if (count == 1) wakeup(consumer);
    }
}
102102
Sleep and Wakeup
void consumer(void)
{
    int item;
    while (TRUE) {
        if (count == 0) sleep();
        item = remove_item();
        count = count - 1;
        if (count == N - 1) wakeup(producer);
        consume_item(item);
    }
}
103103
What are the problems?
Producer-consumer problem with fatal race condition– Access to "count" is unconstrained
– A wakeup call could get lost: e.g. the consumer reads count == 0, is preempted before it sleeps, the producer inserts an item and calls wakeup (which is lost), and the consumer then sleeps forever
Solution is to use semaphores
104104
Semaphore
Semaphores are like a generalized lock
It has a non-negative integer value
Semaphore supports the following ops:– down, up
105105
Down Operation
Wait for semaphore to become positive
Then decrement semaphore by 1– Originally called “P” operation for the Dutch proberen
106106
Up Operation
Increment semaphore by 1– originally called “V”, for the Dutch verhogen
This wakes up a thread waiting in down– if there are any
107107
Semaphores
Can also set the initial value of semaphore
The key parts in down() and up() are atomic
Two down() calls at the same time can’t decrement the value below 0
108108
Binary Semaphores
Value is either 0 or 1
down() waits for value to become 1– then sets it to 0
up() sets value to 1– waking up waiting down (if any)
109109
Mutual Exclusion w/ Semaphore
Initial value is 1 (or more generally, N)
– down()
– <critical section>
– up()
Like lock/unlock, but more general
110110
Semaphores for Ordering
Usually (not always) initial value is 0
Example: thread A wants to wait for thread B to finish before continuing– Semaphore initialized to 0
– A: down(); then continue execution
– B: do task; then up()
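A sketch of this ordering pattern with POSIX semaphores, where sem_wait plays the role of down and sem_post of up (the thread names and the semaphore done are illustrative):

    #include <pthread.h>
    #include <semaphore.h>
    #include <stdio.h>

    sem_t done;                         /* ordering semaphore, initialized to 0 */

    void *thread_b(void *arg) {
        printf("B: doing the task\n");  /* do task */
        sem_post(&done);                /* up(): wake A if it is already waiting */
        return NULL;
    }

    void *thread_a(void *arg) {
        sem_wait(&done);                /* down(): blocks until B has finished */
        printf("A: continuing after B\n");
        return NULL;
    }

    int main(void) {
        pthread_t a, b;
        sem_init(&done, 0, 0);          /* initial value 0: A must wait for B */
        pthread_create(&a, NULL, thread_a, NULL);
        pthread_create(&b, NULL, thread_b, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        sem_destroy(&done);
        return 0;
    }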
111111
PC Problem with Semaphores
Let’s solve the producer-consumer problem in semaphores
112112
Semaphore Assignments
mutex: – ensures mutual exclusion around code that manipulates buffer queue (initialized to 1)
full: – counts the # of full buffers (initialized to 0)
113113
Semaphore Assignments
empty: – counts the number of empty buffers (initialized to N)
Why do we need different semaphores for full and empty buffers?
114114
Solve PC with Semaphores
#define N 100              /* # of slots in the buffer */
typedef int semaphore;     /* a special kind of int */
semaphore mutex = 1;       /* controls access to critical region */
semaphore empty = N;       /* counts empty buffer slots */
semaphore full = 0;        /* counts full buffer slots */
115115
Solve PC with Semaphores
void producer(void)
{
    int item;
    while (TRUE) {
        item = produce_item();
        down(&empty);
        down(&mutex);
        insert_item(item);
        up(&mutex);
        up(&full);
    }
}
116116
Solve PC with Semaphores
void consumer(void)
{
    int item;
    while (TRUE) {
        down(&full);
        down(&mutex);
        item = remove_item();
        up(&mutex);
        up(&empty);
        consume_item(item);
    }
}
117117
Semaphore Assignments
Does the order of the down() function calls matter in the consumer (or the producer)?
Does the order of the up() function calls matter in the consumer (or the producer)?
118118
Solve PC with Semaphores
What (if anything) must change to allow multiple producers and/or multiple consumers?
What if there’s 1 full buffer, and multiple consumers call down(full) at the same time?
119119
Problem with Semaphore
Is there any problem with semaphore?
The order of down and up ops is critical– Improper ordering could cause deadlock
– i.e. programming in semaphore is tricky
How to make programming easier?
120120
Monitor
A programming language construct
A collection of procedures, variables, data structures
Access to it is guaranteed to be exclusive– By the compiler (not programmer)
121121
Monitor
Monitors use separate mechanisms for the two types of synchronization
use locks for mutual exclusion
use condition variables for ordering constraints
122122
Monitor
A monitor = a lock + the condition variables associated with that lock
123123
Condition Variables
Main idea: – make it possible for thread to sleep inside a critical section
Approach:– by atomically releasing the lock, putting the thread on a wait queue, and going to sleep
124124
Condition Variables
Each variable has a queue of waiting threads– threads that are sleeping, waiting for a condition
Each variable is associated with one lock
125125
Monitors
Monitor skeleton:
    monitor example
        integer i;
        condition c;
        procedure producer();
        …
        end;
        procedure consumer();
        …
        end;
    end monitor
126126
Ops on Condition Variables
wait(): – atomically release lock
– put thread on condition wait queue, go to sleep
– i.e. start to wait for wakeup
127127
Ops on Condition Variables
signal():– wake up a thread waiting on this condition variable if any
broadcast():– wake up all threads waiting on this condition variable if any
128128
Ops on Condition Variables
Note that thread must be holding lock when it calls wait() or signal()
To avoid problems when both threads are inside monitor– the signal() must be the last statement
129129
Ops on Condition Variables
What to do when a thread wakes up?
Two options:
Let the woken-up thread run– i.e. the signaler releases the lock
Let the signaling thread and the woken-up thread compete for the lock
130130
Mesa vs. Hoare Monitors
The first option is called Hoare Monitor – It gives special priority to the woken-up waiter
– signaling thread gives up lock
– woken-up waiter acquires lock
– signaling thread re-acquires lock after waiter unlocks
131131
Mesa vs. Hoare Monitors
The second option is Mesa Monitor– when waiter is woken, it must contend for the lock with other threads
– hence must re-check condition
– Whoever wins gets to run
132132
Mesa vs. Hoare Monitors
We’ll stick with Mesa monitors– as do most operating systems
– Because it is more flexible
133133
Programming with Monitors
List shared data needed to solve problem
Decide how many locks are needed
Decide which locks will protect which data
134134
Programming with Monitors
More locks allow different data to be accessed simultaneously– i.e. protecting finer-grained data
– but is more complicated
One lock usually enough in this class
137137
Programming with Monitors
Call wait() when thread needs to wait for a condition to be true;
Use a while loop to re-check condition after wait returns
Call signal when a condition changes that another thread might be interested in
138138
Producer-Consumer in Monitor
Variables:
numCokes (number of cokes in machine)
One lock to protect this shared data– cokeLock
fewer locks make the programming simpler – but allow less concurrency
139139
Producer-Consumer in Monitor
Ordering constraints:
Consumer must wait for producer to fill buffer if all buffers are empty
Producer must wait for consumer to empty a buffer slot if the buffer is completely full
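A sketch of these constraints written with a lock and condition variables in POSIX threads (Mesa-style, hence the while loops; MAX_COKES, notEmpty and notFull are names introduced here, while numCokes and cokeLock come from the previous slide):

    #include <pthread.h>

    #define MAX_COKES 100

    int numCokes = 0;
    pthread_mutex_t cokeLock = PTHREAD_MUTEX_INITIALIZER;
    pthread_cond_t  notEmpty = PTHREAD_COND_INITIALIZER;
    pthread_cond_t  notFull  = PTHREAD_COND_INITIALIZER;

    void produce_coke(void) {
        pthread_mutex_lock(&cokeLock);
        while (numCokes == MAX_COKES)           /* re-check the condition after every wakeup */
            pthread_cond_wait(&notFull, &cokeLock);
        numCokes++;
        pthread_cond_signal(&notEmpty);         /* a waiting consumer may now proceed */
        pthread_mutex_unlock(&cokeLock);
    }

    void consume_coke(void) {
        pthread_mutex_lock(&cokeLock);
        while (numCokes == 0)
            pthread_cond_wait(&notEmpty, &cokeLock);
        numCokes--;
        pthread_cond_signal(&notFull);          /* a waiting producer may now proceed */
        pthread_mutex_unlock(&cokeLock);
    }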
140140
Producer-Consumer in Monitor
monitor ProducerConsumer
    condition full, empty;
    integer count;

    procedure insert(item: integer);
    begin
        if count == N then wait(full);
        insert_item(item);
        count++;
        if count == 1 then signal(empty);
    end;

    function remove: integer;
    begin
        if count == 0 then wait(empty);
        remove = remove_item;
        count--;
        if count == N-1 then signal(full);
    end;

    count := 0;
end monitor
141141
Producer-Consumer in Monitor
procedure producer;
begin
    while true do
    begin
        item = produce_item;
        ProducerConsumer.insert(item);
    end
end;

procedure consumer;
begin
    while true do
    begin
        item = ProducerConsumer.remove;
        consume_item(item);
    end
end;
145145
Condition Variable vs. Semaphore
Condition variables are more flexible than semaphores for ordering constraints
Condition variables: – can wait on an arbitrary condition
Semaphores: – wait only if the semaphore value == 0
147147
Problems with Monitor
Many languages do not support monitors
Don't solve the problem when multiple CPUs or computers are involved– This applies to semaphores too
Solution?– Message passing
148148
Message Passing
A high-level primitive for IPC
It uses two primitives: send and receive– Send(destination, &message)
– Receive(source, &message)
They are system calls (not language constructs)
Can either block or return immediately
149149
Message Passing
#define N 100               /* # of slots in the buffer */

void producer(void)
{
    int item;
    message m;                         /* message buffer */
    while (TRUE) {
        item = produce_item();
        receive(consumer, &m);         /* wait for an empty message */
        build_message(&m, item);       /* construct a message to send */
        send(consumer, &m);
    }
}
150150
Message Passing
void consumer(void)
{
    int item, i;
    message m;
    for (i = 0; i < N; i++) send(producer, &m);   /* send N empties */
    while (TRUE) {
        receive(producer, &m);
        item = extract_item(&m);
        send(producer, &m);                        /* send back an empty reply */
        consume_item(item);
    }
}
151151
Problems with Message Passing
Message passing has many challenging problems
Message loss– What happens when a message is lost?
Authentication– How to determine the identity of the sender?
Performance
152152
Barriers
Another synchronization primitive
Intended for a group of processes
All processes must reach the barrier for the application to proceed to the next phase
153153
Barriers
– (a) processes approaching a barrier
– (b) all processes but one blocked at the barrier
– (c) the last process arrives, and all are let through
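A sketch using POSIX barriers (pthread_barrier_t); NUM_WORKERS, worker and phase_work are illustrative names:

    #include <pthread.h>
    #include <stdio.h>

    #define NUM_WORKERS 4

    pthread_barrier_t phase_barrier;

    void phase_work(int phase, long id) {       /* placeholder for the real per-phase work */
        printf("worker %ld working on phase %d\n", id, phase);
    }

    void *worker(void *arg) {
        long id = (long)arg;
        for (int phase = 0; phase < 3; phase++) {
            phase_work(phase, id);
            /* nobody starts the next phase until all workers have reached the barrier */
            pthread_barrier_wait(&phase_barrier);
        }
        return NULL;
    }

    int main(void) {
        pthread_t t[NUM_WORKERS];
        pthread_barrier_init(&phase_barrier, NULL, NUM_WORKERS);
        for (long i = 0; i < NUM_WORKERS; i++)
            pthread_create(&t[i], NULL, worker, (void *)i);
        for (int i = 0; i < NUM_WORKERS; i++)
            pthread_join(t[i], NULL);
        pthread_barrier_destroy(&phase_barrier);
        return 0;
    }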
154154
Implementing Locks
So far we have used locks extensively
We assumed that lock operations are atomic
But how is the atomicity of the lock itself implemented?
155155
Implementing Locks
Layers (bottom to top):
– Low-level atomic operations provided by hardware (i.e. load/store, interrupt enable/disable, test & set)
– High-level synchronization operations provided by software (i.e. semaphore, lock, monitor)
– Concurrent programs
Locks must be implemented on top of the hardware's atomic operations
156156
Use Interrupt Disable/Enable
On uniprocessor, operation is atomic as long as – context switch doesn’t occur in middle of operation
How does thread get context switched out?– interrupt
Prevent context switches at wrong time by preventing these events
157157
Use Interrupt Disable/Enable
With interrupt disable/enable to ensure atomicity, – why do we need locks?
User program could call interrupt disable – before entering critical section
– and call interrupt enable after leaving critical section
– and make sure not to call yield in the critical section
158158
Lock Implementation #1
Disable interrupts with busy waiting
159159
Lock Implementation #1
lock() {
disable interrupts
while (value != FREE) {
enable interrupts
disable interrupts
}
value = BUSY
enable interrupts
}
Why does lock() disable interrupts at the beginning of the function?
Why is it OK to disable interrupts in lock()'s critical section when it wasn't OK to disable interrupts while user code was running?
160160
Lock Implementation #1
unlock() {
disable interrupts
value = FREE
enable interrupts
}
Do we need to disable interrupts in unlock()?
161161
Problems with Interrupt Approach
Interrupt disable works on a uniprocessor– by preventing current thread from being switched out
But this doesn’t work on a multi-processor
Disabling interrupts on one processor doesn’t prevent other processors from running
Not acceptable (or provided) to modify interrupt disable to stop other processors from running
162162
Read-modify-write Instructions
Another atomic primitive
Use atomic load / atomic store instructions – remember Gold Fish solution #3
163163
Read-modify-write Instructions
Modern processors provide an easier way– with atomic read-modify-write instructions
Read-modify-write atomically – reads value from memory into a register
– Then writes new value to that memory location
164164
Read-modify-write Instructions
test_and_set– atomically writes 1 to a memory location (set)
– and returns the value that used to be there (test)
test_and_set(X) {
tmp = X
X = 1
return(tmp)
}
165165
Lock Implementation #2
Test & set with busy waiting (value is initially 0)
lock() {
while (test_and_set(value) == 1) {}
}
unlock() {
value = 0
}
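A concrete version of this implementation (a sketch, using C11's atomic_flag, whose test-and-set operation is guaranteed atomic):

    #include <stdatomic.h>

    atomic_flag value = ATOMIC_FLAG_INIT;        /* clear (0) means the lock is free */

    void lock(void) {
        /* atomic_flag_test_and_set: set the flag and return what it was before */
        while (atomic_flag_test_and_set(&value))
            ;                                    /* busy-wait while someone else holds it */
    }

    void unlock(void) {
        atomic_flag_clear(&value);               /* value = 0: release the lock */
    }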
166166
Lock Implementation #2
If lock is free (value = 0)– test_and_set sets value to 1 and returns 0,
– so the while loop finishes
If lock is busy (value = 1)– test_and_set doesn’t change the value and returns 1,
– so loop continues
167167
Strategy for Reducing Busy-Waiting
In methods 1 & 2, the waiting thread uses lots of CPU time just checking for the lock to become free
Better for it to sleep and let other threads run
168168
Lock Implementation #3
Interrupt disable, no busy-waiting
Waiting thread gives up processor so that other threads (e.g. thread with lock) can run more quickly
Someone wakes up thread when the lock is free
169169
Lock Implementation #3
lock() {
disable interrupts
if (value == FREE) {
value = BUSY
} else {
add thread to queue of threads waiting for this lock
switch to next runnable thread
}
enable interrupts
}
When should lock() re-enable interrupts before calling switch?
170170
Lock Implementation #3
unlock() {
disable interrupts
value = FREE
if (any thread is waiting for this lock) {
move waiting thread from waiting queue to ready queue
value = BUSY
}
enable interrupts
}
172172
Interrupt Disable/Enable Pattern
Enable interrupts before adding thread to wait queue?
lock() {
disable interrupts
...
if (lock is busy) {
enable interrupts
add thread to lock wait queue
switch to next run-able thread
}
When could this fail?
173173
Interrupt Disable/Enable Pattern
Enable interrupts after adding thread to wait queue, but before switching to next thread?
174174
Interrupt Disable/Enable Pattern
lock() {
disable interrupts
...
if (lock is busy) {
add thread to lock wait queue
enable interrupts
switch to next runnable thread
}
175175
Interrupt Disable/Enable Pattern
But this fails if an interrupt happens right after the thread enables interrupts– lock() adds thread to wait queue
– lock() enables interrupts
– interrupt causes preemption,
– i.e. switch to another thread.
176176
Interrupt Disable/Enable Pattern
Preemption moves thread to ready queue– Now thread is on two queues (wait and ready)!
Also, switch is likely to be a critical section– Adding thread to wait queue and switching to next thread must be atomic
177177
Solution
Waiting thread leaves interrupts disabled – when it calls switch
Next thread to run has the responsibility of– re-enabling interrupts before returning to user code
When waiting thread wakes up– it returns from switch with interrupts disabled from the last thread
178178
Invariant
All threads promise to have interrupts disabled when they call switch
All threads promise to re-enable interrupts after they get returned to from switch
179179
Example trace (each line labeled with the thread that executes it):
    B: yield() {
    B:     disable interrupts
    B:     switch
    A:     back from switch
    A:     enable interrupts
    A: }
    A: <user code runs>
    A: lock() {
    A:     disable interrupts
    A:     ...
    A:     switch
    B:     back from switch
    B:     enable interrupts
    B: }
    B: <user code runs>
    B: unlock() (move thread A to ready queue)
    B: yield() {
    B:     disable interrupts
    B:     switch
    A:     back from switch
    A:     enable interrupts
    A: }
180180
Lock Implementation #4
Test & set, minimal busy-waiting
Can’t implement locks using test & set without some amount of busy-waiting– but can minimize it
Idea:– use busy waiting only to atomically execute lock code
– Give up CPU if busy
181181
Lock Implementation #4
lock() {
while(test&set(guard)) {
}
if (value == FREE) {
value = BUSY
} else {
add thread to queue of threads waiting for this lock
switch to next runnable thread
}
guard = 0
}
182182
Lock Implementation #4
unlock() {
while (test&set(guard)) {
}
value = FREE
if (any thread is waiting for this lock) {
move waiting thread from waiting queue to ready queue
value = BUSY
}
guard = 0
}
183183
CPU Scheduling
How should one choose next thread to run? What are the goals of the CPU scheduler?
Bursts of CPU usage alternate with periods of I/O wait
184184
CPU Scheduling
(a) CPU-bound process (b) I/O bound process
185185
CPU Scheduling
Minimize average response time– average elapsed time to do each job
Maximize throughput of entire system– rate at which jobs complete in the system
Fairness– share CPU among threads in some “equitable” manner
186186
Scheduling Algorithm Goals
All systems
Fairness:– giving each process a fair share of the CPU
Policy enforcement:– seeing that stated policy is carried out
Balance:– keeping all parts of the system busy
187187
Scheduling Algorithm Goals
Batch systems
Throughput:– maximize jobs per hour
Turnaround time:– minimize time between submission and termination
CPU utilization:– keep the CPU busy all the time
188188
Scheduling Algorithm Goals
Interactive systems– Response time – respond to requests quickly
– Proportionality – meet users’ expectations
Real-time systems– Meeting deadlines – avoid losing data
– Predictability – avoid quality degradation in multimedia systems
189189
FCFS
First-Come, First-Served
FIFO ordering between jobs
No preemption (run until done)– thread runs until it calls yield() or blocks on I/O
– no timer interrupts
190190
FCFS
Pros and cons
+ simple
- short jobs get stuck behind long jobs
- what about the user’s interactive experience?
191191
FCFS
Example– job A takes 100 seconds
– job B takes 1 second
– time 0: job A arrives and starts
– time 0+: job B arrives
– time 100: job A ends (response time=100); job B starts
– time 101: job B ends (response time = 101)
average response time = 100.5
192192
Round Robin
Goal: – improve average response time for short jobs
Solution: – periodically preempt all jobs (especially long-running ones)
Is FCFS or round robin more “fair”?
193193
Round Robin
Example
job A takes 100 seconds
job B takes 1 second
time slice of 1 second – a job is preempted after running for 1 second
194194
Round Robin
– time 0: job A arrives and starts
– time 0+: job B arrives
– time 1: job A is preempted; job B starts
– time 2: job B ends (response time = 2)
– time 101: job A ends (response time = 101)
average response time = 51.5
195195
Round Robin
Does round-robin always achieve lower response time than FCFS?
196196
Round Robin
Pros and cons
+ good for interactive computing
- round robin has more overhead due to context switches
197197
Round Robin
How to choose time slice?
big time slice: – degrades to FCFS
small time slice: – each context switch wastes some time
198198
Round Robin
typically a compromise– 10 milliseconds (ms)
if context switch takes .1 ms– then round robin with 10 ms slice wastes 1% of CPU
199199
STCF
Shortest Time to Completion First
STCF: run whatever job has the least amount of work to do before it finishes or blocks for an I/O
200200
STCF
STCF-P: preemptive version of STCF
if a new job arrives that has less work than the current job has remaining– then preempt the current job in favor of the new one
201201
STCF
Idea is to finish the short jobs first
Improves response time of shorter jobs by a lot– Doesn’t hurt response time of longer jobs by too much
202202
STCF
STCF gives optimal response time among non-preemptive policies
STCF-P gives optimal response time among preemptive policies (and non-preemptive policies)
203203
STCF
Is the following job a “short” or “long” job?– while(1) {
– use CPU for 1 ms
– use I/O for 10 ms
– }
204204
STCF
Pros and cons
+ optimal average response time
- unfair. Short jobs can prevent long jobs from ever getting any CPU time (starvation)
- needs knowledge of future
205205
STCF
STCF and STCF-P need knowledge of future– it’s often very handy to know the future :-)
– how to find out the future time required by a job?
206206
Example
job A
compute for 1000 seconds
job B
compute for 1000 seconds
job C
while(1) {
use CPU for 1 ms
use I/O for 10 ms
}
207207
Example
C can use 91% of the disk by itself
A or B can each use 100% of the CPU
What happens when we run them together?
Goal: keep both CPU and disk busy
208208
FCFS
if A or B run before C,
they prevent C from issuing its disk I/O for – up to 2000 seconds
209209
Round Robin
with 100 ms time slice– CA---------B---------CA---------B---------...
– |--|
– C’s
– I/O
Disk is idle most of the time that A & B are running – about 10 ms disk time every 200 ms
210210
Round Robin
with 1 ms time slice– CABABABABABCABABABABABC...
– |--------| |--------|
– C’s C’s
– I/O I/O
C runs more often, so it can issue its disk I/O almost as soon as its last disk I/O is done
211211
Round Robin with 1ms
Disk is utilized almost 90% of the time
Little effect on A or B’s performance
General principle: – first start the things that can run in parallel
problem: – lots of context switches (and context switch overhead)
212212
STCF-P
Runs C as soon as its disk I/O is done– because it has the shortest next CPU burst
– CA-------CA-------------CA--------- ...
– |--------| |--------| |---------|
– C’s C’s C’s
– I/O I/O I/O
213213
Real-Time Scheduling
So far, we’ve focused on average-case analysis– average response time, throughput
Sometimes, the right goal is to get each job done before its deadline– it is irrelevant how far before its deadline the job completes
214214
Real-Time Scheduling
Video or audio output. – E.g. NTSC outputs 1 TV frame every 33 ms
Control of physical systems– e.g. auto assembly, nuclear power plants
215215
Real-Time Scheduling
This requires worst-case analysis
How do we do this in real life?
216216
EDF
Earliest-deadline first
Always run the job that has the earliest deadline– i.e. the deadline coming up next
If a new job arrives with an earlier deadline than the currently running job– preempt the running job and start the new one
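A sketch of the selection step in C (the job structure and ready list are hypothetical, just to make the rule concrete):

    #include <stddef.h>

    struct job {
        long deadline;                 /* absolute deadline, e.g. in milliseconds */
        long work_left;                /* remaining execution time */
    };

    /* Earliest-deadline-first: pick the ready job whose deadline comes up next. */
    struct job *edf_pick(struct job *ready[], size_t n) {
        struct job *best = NULL;
        for (size_t i = 0; i < n; i++)
            if (ready[i] != NULL && (best == NULL || ready[i]->deadline < best->deadline))
                best = ready[i];
        return best;                   /* a new arrival with an earlier deadline preempts */
    }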
217217
EDF
EDF is optimal– it will meet all deadlines if it’s possible to do so
218218
EDF
Example
job A: takes 15 seconds, deadline is 20 seconds after entering system
job B: takes 10 seconds, deadline is 30 seconds after entering system
job C: takes 5 seconds, deadline is 10 seconds after entering system
219219
EDF
Timeline (0 to 85 seconds) showing when A, B and C run; each '+' marks that job's deadline
220220
Schedulability in Real-Time Systems
Not all sets of tasks are schedulable in real-time systems
Given– m periodic events
– event i occurs with period Pi and requires Ci seconds of CPU time
Then the load can only be handled if
    C1/P1 + C2/P2 + … + Cm/Pm ≤ 1 (i.e. the sum of Ci/Pi over i = 1..m is at most 1)
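As an illustrative calculation (numbers chosen for this example, not from the slides): three periodic events with periods of 100, 200, and 500 ms that need 50, 30, and 100 ms of CPU time per occurrence give 50/100 + 30/200 + 100/500 = 0.50 + 0.15 + 0.20 = 0.85 ≤ 1, so the set is schedulable; a fourth event with a period of 1 second could be added only if it needs at most 150 ms of CPU per occurrence.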
221221
Scheduling in Batch Systems
First come first serve
Shortest job first
Shortest remaining time next
Three-level scheduling
222222
Scheduling in Batch Systems
An example of shortest job first scheduling
223223
Scheduling in Batch Systems
Three level scheduling
224224
Scheduling in Interactive Systems
Round-robin scheduling
Priority scheduling– Run highest priority process until it blocks or exits
– Alternatively, decrease the priority at each clock tick
Multiple queues– Divide priorities into classes
– Dynamically adjust a process’s priority class
225225
Scheduling in Interactive Systems
Shortest process next– Estimate the running time of processes
Guaranteed scheduling– Each process runs 1/n fraction of time
Lottery scheduling– Issue lottery tickets to process & schedule accordingly
Fair share scheduling per user
226226
Scheduling in Interactive Systems
Round Robin Scheduling– list of runnable processes
– list of runnable processes after B uses up its quantum
227227
Scheduling in Interactive Systems
A scheduling algorithm with four priority classes
228228
Policy versus Mechanism
Separate what is allowed to be done – from how it is done
Important in thread scheduling– a process knows which of its child threads are important and need priority
229229
Policy versus Mechanism
Scheduling algorithm parameterized– mechanism in the kernel
Parameters filled in by user processes– policy set by user process
230230
Thread Scheduling
Possible scheduling of user-level threads
231231
Thread Scheduling
Possible scheduling of kernel-level threads
Thoughts Change Life 意念改变生活